How to extract all links from plain text?
Posted by: pspad | Date: 2020-01-19 11:35 | IP: IP Logged
OK. The Edit menu / special conversion has these options:
URL -> text
text -> URL
But I found that they don't work correctly for characters such as Chinese. I will fix this, and the fix will be available in the next developer build.
I will also check whether the other conversions there are fully Unicode-ready.
Posted by: maki | Date: 2020-01-19 13:32 | IP: IP Logged
PSPad incorrectly changes the encoding. It currently does:
Percent-encoding to Unicode (Current Encoding)
It should do:
Percent-encoding to Unicode (UTF-8)
Edited 1 time(s). Last edit at 2020-01-19 13:33 by maki.
Posted by: pspad | Date: 2020-01-19 14:40 | IP: IP Logged
I don't know what percent-encoding is, or why it should be UTF-8. Where?
Please, when you write anything, try to write it so that I can answer you without any further questions.
Posted by: maki | Date: 2020-01-19 15:54 | IP: IP Logged
In PSPad you will not see the name if it is Unicode.
However, choosing the regular Unicode encoding will destroy the charset; UTF-8 must be used here. URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits.
URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.
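For reference, the difference between decoding as UTF-8 and decoding with another charset can be sketched in Python. This is only an illustration, not how PSPad does it; the sample string and the choice of latin-1 as the "wrong" charset are assumptions made for the example. The standard urllib.parse functions use UTF-8 by default:

```python
from urllib.parse import quote, quote_plus, unquote

text = "中文 test"  # hypothetical sample, not taken from the thread

encoded = quote(text)            # UTF-8 bytes, each written as %XX
encoded_plus = quote_plus(text)  # same, but a space becomes "+"

decoded_utf8 = unquote(encoded)                       # decodes as UTF-8 by default
decoded_wrong = unquote(encoded, encoding="latin-1")  # wrong charset gives mojibake

print(encoded)                # %E4%B8%AD%E6%96%87%20test
print(encoded_plus)           # %E4%B8%AD%E6%96%87+test
print(decoded_utf8 == text)   # True
print(decoded_wrong == text)  # False
```

Decoding the same percent-escapes with a single-byte charset turns each Chinese character into three unrelated characters, which is presumably the kind of damage described above.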
Edited 1 time(s). Last edit at 2020-01-19 15:55 by maki.
Posted by: pspad | Date: 2020-01-19 16:20 | IP: IP Logged
What are you running to get the URL-safe encoded text?
Posted by: maki | Date: 2020-01-21 11:46 | IP: IP Logged
@pspad - thanks for the help. Your regex is good but still not perfect; you have to try to improve it.
For example, it detects this invalid URL:
wwwww.1.com
Posted by: maki | Date: 2020-01-21 12:39 | IP: IP Logged
maki: @pspad - thanks for the help. Your regex is good but still not perfect; you have to try to improve it. For example, it detects this invalid URL:
wwwww.1.com
maybe???
w{3,3}
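A quick check in Python suggests why the quantifier alone may not be enough. This is a minimal sketch with made-up patterns, not the regex discussed in this thread: w{3,3} is the same as w{3}, and without a boundary the engine can still find "www" inside "wwwww". The lookbehind below is one possible way to require that "www" starts the host name; www.example.com is a placeholder.

```python
import re

# w{3,3} is equivalent to w{3}; with no boundary it can match inside "wwwww".
loose = re.compile(r"w{3}\.\S+")

# Assumed fix: "www" must not be preceded by a word character or a dot.
strict = re.compile(r"(?<![\w.])w{3}\.\S+")

sample = "wwwww.1.com"

print(loose.search(sample).group())  # www.1.com
print(strict.search(sample))         # None
print(strict.search("see www.example.com now").group())  # www.example.com
```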
Posted by: pspad | Date: 2020-01-21 12:48 | IP: IP Logged
maki:maybe???
w{3,3}
What about the case where the text contains only 1.com? Is that a valid URL or not?
www in the URL isn't mandatory.
I don't think you will be able to create a universal expression that handles all of your possible or broken URLs.
Do it in 2 steps:
1. extract anything from your text that looks like a URL
2. check validity
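The two steps could be sketched in Python roughly like this. It is only an illustration under stated assumptions: the candidate pattern is deliberately loose, and the validity rules (a top-level domain of at least two letters, and a first label made only of "w" must be exactly "www") are hypothetical rules chosen to fit the examples in this thread, not an official definition of a valid URL.

```python
import re
from typing import List
from urllib.parse import urlsplit

# Step 1: loose pattern that only collects candidates (assumed for this example).
CANDIDATE = re.compile(r"(?:https?://)?[\w-]+(?:\.[\w-]+)+(?:/\S*)?")


def looks_valid(candidate: str) -> bool:
    """Step 2: apply hypothetical validity rules to one candidate."""
    url = candidate if "://" in candidate else "http://" + candidate
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    if len(labels) < 2 or not all(labels):
        return False
    # Assumed rule: the top-level domain is at least two ASCII letters.
    if not re.fullmatch(r"[a-z]{2,}", labels[-1]):
        return False
    # Assumed rule: a first label made only of "w" must be exactly "www".
    if re.fullmatch(r"w+", labels[0]) and labels[0] != "www":
        return False
    return True


def extract_urls(text: str) -> List[str]:
    """Run step 1, then keep only the candidates that pass step 2."""
    return [c for c in CANDIDATE.findall(text) if looks_valid(c)]


text = "see wwwww.1.com and www.com or http://com.pl plus 1.com"
print(extract_urls(text))  # ['www.com', 'http://com.pl', '1.com']
```

Keeping the rules in a separate function means the decision about cases like 1.com can be changed without touching the extraction pattern.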
Edited 1 time(s). Last edit at 2020-01-21 12:48 by pspad.
Posted by: maki | Date: 2020-01-22 09:22 | IP: IP Logged
As the old Russian proverb says: "You won't drink all the vodka, you won't have all the women, but you have to try!"
That's why it's not worth giving up. Genius apparently lies in simplicity, so I keep trying.
mathiasbynens.be/demo/url-regex
Edited 1 time(s). Last edit at 2020-01-22 09:23 by maki.
Posted by: maki | Date: 2020-01-26 18:47 | IP: IP Logged
Quote:I think you are not able create universal expression what will handle all of your possible or broken URL.
Everything is possible, and today I succeeded.
Of course, the syntax will not work in PSPad.
208 characters
<a alt="<>" href="http://www.stairws.com">
w5ww.com www.com http://com.pl
www.com
wwww.com
1.com
Tested in Notepad++: it works. This is the perfect regular expression (URL match) for today.
Edited 3 time(s). Last edit at 2020-01-26 18:50 by maki.