
Problem Extract URL


#11 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 13:34 | IP: IP Logged

It does not match multiple links:

translate.google.pl

webcache.googleusercontent.com



#12 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 14:12 | IP: IP Logged

Same situation as before.
You sent some examples and you got an answer.
Now you come back saying that it doesn't match something totally different.

Learn about regular expressions and build your expression part by part to match your URLs.
First you need to understand how it works. Without that, you will remain a copy/paste guy only.



#13 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 14:18 | IP: IP Logged

This regex is good for MULTIPLE URLs, but it still needs to be corrected to handle "-" and "/":
number/number/number/number
OR
number/number/number-number

\.ru/([^/]+)/([^/]+)/(\-*\d+)|\.ru/avtor+/[a-zA-Z0-9-]+$

I probably need to use

\b

The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a “word boundary”. This match is zero-length.
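
To make the zero-length point concrete, here is a minimal sketch. It assumes Python's `re` module rather than PSPad's own regex engine, and the "cat" strings are made up for illustration; only `.ru/avtor/1gG-` is the kind of address discussed in this thread.

```python
import re

# \b consumes no characters: the match at the start of "abc" is empty.
span = re.search(r"\b", "abc").span()
print(span)  # (0, 0)

# \bcat\b matches "cat" only where it is not glued to other word characters.
words = re.findall(r"\bcat\b", "cat concat cat-like")
print(words)  # ['cat', 'cat']

# "-" is not a word character, so there is no word boundary after a trailing "-".
trailing = re.search(r"1gG-\b", ".ru/avtor/1gG-")
print(trailing)  # None
```

The last check suggests that \b alone would not help with an address that ends in "-".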

Edited 3 time(s). Last edit at 2020-03-15 14:21 by maki.



#14 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 14:21 | IP: IP Logged

\.ru/([^/]+)/([^/]+)/(\-*\d+)|\.ru/avtor+(/[a-zA-Z0-9-]+)+$



#15 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 14:25 | IP: IP Logged

pspad:
\.ru/([^/]+)/([^/]+)/(\-*\d+)

Wrong match!

Again, it must match
XXX/XXX/XXX-XXX
and it must also match
XXX/XXX/XXX/XXX

Perl Regex


Edited 3 time(s). Last edit at 2020-03-15 14:27 by maki.



#16 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 14:32 | IP: IP Logged

Invalid regex, it does not extract the URL:

\.ru/([^/]+)/([^/]+)/(\-*\d+)|\.ru/avtor+(/[a-zA-Z0-9-]+)+$


It should detect this match:
.ru/avtor/1gG-

Ignore \s (url match)

Edited 3 time(s). Last edit at 2020-03-15 14:34 by maki.



#17 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 14:36 | IP: IP Logged

\.ru(/[a-zA-Z0-9-]+)+



#18 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 14:49 | IP: IP Logged

pspad:
\.ru(/[a-zA-Z0-9-]+)+

Works.

But the first regex still does not match the full address:

\.ru/([^/]+)/([^/]+)/(\d+\-*\d+)

.ru/2098/333/333/456
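
A quick way to see where the match stops is to run the expression against the address. This is only a sketch in Python's `re` module (an assumption, the thread uses PSPad's Perl-style regex), and the single-digit address is a made-up example.

```python
import re

# The expression from this post, unchanged.
pattern = re.compile(r"\.ru/([^/]+)/([^/]+)/(\d+\-*\d+)")

sample = ".ru/2098/333/333/456"

# The last group allows digits and "-" only, so the match ends before "/456".
partial = pattern.search(sample).group(0)
print(partial)  # .ru/2098/333/333

# \d+\-*\d+ needs at least two digits, so a one-digit last part does not match.
single_digit = pattern.search(".ru/2098/333/7")  # hypothetical address
print(single_digit)  # None
```

The remaining "/456" is never consumed because no part of the expression allows another "/" after the third segment.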

Edited 1 time(s). Last edit at 2020-03-15 14:49 by maki.



#19 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 14:54 | IP: IP Logged

So you have a regular expression that detects the first part correctly, and you have an expression that works correctly with the rest of the address.
Put them together. But to do that, you need to understand what you are doing.
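
One possible reading of "put them together", sketched in Python's `re` module (an assumption, not PSPad's engine): keep the first two segments from the earlier expression and let the repeated-segment group from post #17 take the rest. The "333-456" address is a made-up variant of the one quoted in the thread.

```python
import re

# Verbose mode lets each part carry its own comment.
pattern = re.compile(r"""
    \.ru                 # literal ".ru"
    /([^/]+)             # first path segment
    /([^/]+)             # second path segment
    (?:/[a-zA-Z0-9-]+)+  # one or more remaining segments, "-" allowed
""", re.VERBOSE)

slash_form = pattern.search(".ru/2098/333/333/456")
dash_form = pattern.search(".ru/2098/333/333-456")  # hypothetical address

print(slash_form.group(0))  # .ru/2098/333/333/456
print(dash_form.group(0))   # .ru/2098/333/333-456
print(slash_form.group(1), slash_form.group(2))  # 2098 333
```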



#20 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 15:25 | IP: IP Logged

The expression works, but it's messy!
The "|" is practically unnecessary here, but I can't correct the last part so that it matches both "XXX-XXX" and "XXX/XXX".

\.ru/([^/]+)/([^/]+)/([^/]+)/(\d+)|\.ru/([^/]+)/([^/]+)/(\d+-\d+)|\.ru/avtor(/[a-zA-Z0-9-]+)+
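
A character class such as [-/] might be what is missing for the last part. The sketch below (Python's `re` module, an assumption) compares the expression above with a hypothetical shorter form on three addresses. The shorter form assumes the third path part is always numeric, which the original does not require, and the "333-456" address is made up.

```python
import re

# The three-branch expression from this post, unchanged.
original = re.compile(
    r"\.ru/([^/]+)/([^/]+)/([^/]+)/(\d+)"
    r"|\.ru/([^/]+)/([^/]+)/(\d+-\d+)"
    r"|\.ru/avtor(/[a-zA-Z0-9-]+)+"
)

# Hypothetical shorter form: [-/] accepts either separator between the
# last two numbers, so the first two branches collapse into one.
shorter = re.compile(
    r"\.ru/([^/]+)/([^/]+)/(\d+[-/]\d+)"
    r"|\.ru/avtor(/[a-zA-Z0-9-]+)+"
)

samples = [
    ".ru/2098/333/333/456",  # from the thread
    ".ru/2098/333/333-456",  # made-up "XXX-XXX" variant
    ".ru/avtor/1gG-",        # from the thread
]

results = [(original.search(s).group(0), shorter.search(s).group(0)) for s in samples]
for old, new in results:
    print(old == new, new)
```

On these three samples both expressions return the whole address. Other addresses could still behave differently, so the shorter form should be checked against the real list of URLs.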

Edited 3 time(s). Last edit at 2020-03-15 15:29 by maki.







