
Re: Extract URLs preceded by a predefined string in an HTML file

#1 Extract URLs preceded by a predefined string in an HTML file

Posted by: Esgrimidor | Date: 2014-02-14 20:21

I want to extract URLs preceded by a predefined string in an HTML file to a new file, with one URL per line.

How can I do that?

Best Regards

--
Nice program indeed



#2 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: pspad | Date: 2014-02-14 20:38

To extract URLs into a new file:
open the search dialog
click the button with ! next to the search field and choose Find URL
press the Copy button



#3 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: Esgrimidor | Date: 2014-02-15 10:35

Going to try it right now.

--
Nice program indeed



#4 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: Esgrimidor | Date: 2014-02-15 10:53

I think it is failing.
The file has 412812 lines. PSPad seems to be working, but it never finishes creating the target file.

Then I tried deleting the string that appears in the search box at the start and leaving the box empty...

It works well now:
23,668 URLs extracted, each one on a separate line.

But how can I filter the results if I want only the URLs preceded by a predefined string?

--
Nice program indeed



#5 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: pspad | Date: 2014-02-15 11:25

Modify the regular expression by adding, at the beginning, a string that identifies what you want.



#6 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: Esgrimidor | Date: 2014-02-15 21:08

I will try it and report back. I am not used to regular expressions.

Best Regards

--
Nice program indeed



#7 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: pspad | Date: 2014-02-15 21:37

If you want help from us, you need to provide some examples of your lines. Sorry, but we are not mind readers ;-)



#8 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: Esgrimidor | Date: 2014-02-19 12:16

Trying again.

This is part of the file I want to extract URLs from:

www.proof.com
[Ref]http://itv.com
gent asom.net
[Ref]http://www.imagen.org

The predefined string is "[Ref]".

How can I do that?

Best Regards

--
Nice program indeed



#9 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: pspad | Date: 2014-02-19 12:20

Simply put [Ref] at the beginning of the existing regular expression:
\[Ref\]

The backslashes are there because brackets are control characters in regular expressions.
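
For readers who want to see the effect of the backslashes outside PSPad, here is a minimal sketch in Python. It is an illustration only: the URL pattern `https?://\S+` is an assumption (the thread does not show the expression PSPad's Find URL uses), and the sample text is copied from post #8.

```python
import re

# Sample lines copied from post #8.
sample = """www.proof.com
[Ref]http://itv.com
gent asom.net
[Ref]http://www.imagen.org"""

# Assumed, simplified URL pattern for illustration only;
# it is NOT the expression PSPad's "Find URL" uses.
URL = r"https?://\S+"

# Escaped brackets match the literal text "[Ref]" in front of the URL.
escaped = re.compile(r"\[Ref\](" + URL + ")")

# Unescaped, [Ref] is a character class: exactly one of 'R', 'e' or 'f'.
unescaped = re.compile(r"[Ref](" + URL + ")")

urls = escaped.findall(sample)
no_urls = unescaped.findall(sample)

print(urls)     # the two URLs that follow "[Ref]"
print(no_urls)  # empty: the character before "http" is "]", not R, e or f
```

Without the backslashes the expression looks for a single letter in front of the URL, so none of the sample lines match.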



#10 Re: Extract URLs preceded by a predefined string in an HTML file

Posted by: Andreas | Date: 2014-02-19 13:09

You can try search and replace.
search:
\[Ref\](.*$)|.*
replace:
$1

$1 is the back reference to the value captured by the first pair of parentheses
| means OR
$ means end of line

But if you want the output cleaned up in a new file, I think you have to write a script. Read the PSPad manual on scripting and use JavaScript; there is plenty of help and many tutorials for it.

Search for an online regex tester and a regex tutorial in your language.
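
As a rough sketch of such a clean-up script, the following standalone Python example applies the same search/replace pair line by line, drops the lines that end up empty, and writes the rest to a new file. It does not use PSPad's scripting interface; the function names `extract_prefixed` and `save_urls` and the output file name `urls.txt` are hypothetical, chosen only for this illustration.

```python
import re
import tempfile
from pathlib import Path

# The search expression from the post above.
PATTERN = re.compile(r"\[Ref\](.*$)|.*")


def extract_prefixed(text):
    """Return what follows "[Ref]" on each line that starts with it."""
    results = []
    for line in text.splitlines():
        # Equivalent of replacing with $1: keep group 1 when the first
        # alternative matched, otherwise replace the match with nothing.
        replaced = PATTERN.sub(lambda m: m.group(1) or "", line)
        if replaced:  # skip lines that became empty
            results.append(replaced)
    return results


def save_urls(urls, path):
    """Write one URL per line to the given file (hypothetical helper)."""
    Path(path).write_text("\n".join(urls) + "\n", encoding="utf-8")


# Sample lines copied from post #8.
sample = """www.proof.com
[Ref]http://itv.com
gent asom.net
[Ref]http://www.imagen.org"""

urls = extract_prefixed(sample)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "urls.txt"  # hypothetical output file name
    save_urls(urls, target)
    written = target.read_text(encoding="utf-8")

print(written)
```

Note that this expression only keeps lines where [Ref] is the first thing on the line, as in the sample; if the marker sits in the middle of a line, the .* alternative matches first and the whole line is discarded.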






