Problem Extract URL
Posted by: pspad | Date: 2020-03-15 14:12 | IP: IP Logged
Same situation as before.
You sent some examples and you got an answer.
Now you come back saying that it doesn't match something totally different.
Learn about regular expressions and build your expression part by part to match your URLs.
First you need to understand how it works. Without that, you will remain a copy/paste guy only.
Posted by: maki | Date: 2020-03-15 14:18 | IP: IP Logged
This regex is good for multiple URLs, but it still needs to be corrected to handle both "-" and "/":
number/number/number/number
OR
number/number/number-number
\.ru/([^/]+)/([^/]+)/(\-*\d+)|\.ru/avtor+/[a-zA-Z0-9-]+$
I probably need to use
\b
The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a “word boundary”. This match is zero-length.
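To make the zero-length point concrete, here is a small sketch in Python's `re` module (an assumption; the thread itself uses the Perl-style regex in PSPad, where `\b` should behave the same way for ASCII text). The sample string `333-456` is made up for illustration.

```python
import re

text = "333-456"

# \b matches positions, not characters, so every match is an empty string.
positions = [m.start() for m in re.finditer(r"\b", text)]
print(positions)

# A hyphen is not a word character, so there is a boundary on each side of it
# and the two numbers are found as separate "words".
numbers = re.findall(r"\b\d+\b", text)
print(numbers)
```

Because the hyphen splits the text into two words, `\b` alone will not keep `333-456` together as one unit; the hyphen still has to be matched explicitly.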
Edited 3 time(s). Last edit at 2020-03-15 14:21 by maki.
Posted by: pspad | Date: 2020-03-15 14:21 | IP: IP Logged
\.ru/([^/]+)/([^/]+)/(\-*\d+)|\.ru/avtor+(/[a-zA-Z0-9-]+)+$
Posted by: maki | Date: 2020-03-15 14:25 | IP: IP Logged
pspad:\.ru/([^/]+)/([^/]+)/(\-*\d+)
Wrong Match!
Again!
Must Match
XXX/XXX/XXX-XXX
and
Must Match
XXX/XXX/XXX/XXX
Perl Regex
Edited 3 time(s). Last edit at 2020-03-15 14:27 by maki.
Posted by: maki | Date: 2020-03-15 14:32 | IP: IP Logged
Invalid regex / does not extract the URL:
\.ru/([^/]+)/([^/]+)/(\-*\d+)|\.ru/avtor+(/[a-zA-Z0-9-]+)+$
It should detect a match for:
.ru/avtor/1gG-
Ignore \s (url match)
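One possible reason for the missed match, assuming the URL is followed by other text on the same line, is the trailing `$` anchor. The sketch below uses Python's `re` module and a made-up host `site.ru`; both are assumptions for illustration, not part of the original posts.

```python
import re

# Second alternative of the expression above, with and without the "$" anchor.
anchored = re.compile(r"\.ru/avtor+(/[a-zA-Z0-9-]+)+$")
unanchored = re.compile(r"\.ru/avtor+(/[a-zA-Z0-9-]+)+")

alone = "http://site.ru/avtor/1gG-"               # URL ends the line
in_text = "http://site.ru/avtor/1gG- more text"   # URL followed by a space

found_alone = anchored.search(alone) is not None
found_in_text = anchored.search(in_text) is not None
print(found_alone, found_in_text)

# Without "$" the match simply stops at the first character
# that is not in [a-zA-Z0-9-], here the space.
extracted = unanchored.search(in_text).group(0)
print(extracted)
```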
Edited 3 time(s). Last edit at 2020-03-15 14:34 by maki.
Posted by: pspad | Date: 2020-03-15 14:36 | IP: IP Logged
\.ru(/[a-zA-Z0-9-]+)+
Posted by: maki | Date: 2020-03-15 14:49 | IP: IP Logged
pspad:\.ru(/[a-zA-Z0-9-]+)+
Works.
But the first regex still does not detect the full address:
\.ru/([^/]+)/([^/]+)/(\d+\-*\d+)
.ru/2098/333/333/456
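The truncation can be reproduced with a short check. This is a sketch in Python's `re` module (an assumption; PSPad uses its own Perl-style engine), and the hyphenated sample is made up for comparison.

```python
import re

pattern = re.compile(r"\.ru/([^/]+)/([^/]+)/(\d+\-*\d+)")

# The third group allows an optional "-" between digits but never a "/",
# so the match ends before the last "/456" segment.
slash_match = pattern.search(".ru/2098/333/333/456").group(0)
print(slash_match)

# With a hyphen instead of a slash the whole address is covered.
hyphen_match = pattern.search(".ru/2098/333/333-456").group(0)
print(hyphen_match)
```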
Edited 1 time(s). Last edit at 2020-03-15 14:49 by maki.
Posted by: pspad | Date: 2020-03-15 14:54 | IP: IP Logged
So you have a regular expression that detects the first part correctly, and you have an expression that works correctly with the rest of the address.
Put them together. But to do that, you need to understand what you are doing.
Posted by: maki | Date: 2020-03-15 15:25 | IP: IP Logged
The expression works, but it's messy!
Here the "|" is practically unnecessary, but I can't correct the last part so that it matches both "XXX-XXX" and "XXX/XXX".
\.ru/([^/]+)/([^/]+)/([^/]+)/(\d+)|\.ru/([^/]+)/([^/]+)/(\d+-\d+)|\.ru/avtor(/[a-zA-Z0-9-]+)+
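One way the first two alternatives might be folded together is a character class `[-/]` for the last separator. The sketch below is a hypothetical simplification, not the thread's accepted answer: it assumes the third segment is always numeric (the original first alternative allows any non-slash text there), and the host `site.ru` and the sample text are made up. It is written for Python's `re` module.

```python
import re

# Hypothetical merged pattern: either an "avtor" path, or
# two free segments followed by number-number or number/number.
merged = re.compile(
    r"\.ru/(?:avtor(?:/[a-zA-Z0-9-]+)+|[^/]+/[^/]+/\d+[-/]\d+)"
)

text = (
    "http://site.ru/2098/333/333/456 "
    "http://site.ru/2098/333/333-456 "
    "http://site.ru/avtor/1gG-"
)

matches = [m.group(0) for m in merged.finditer(text)]
for url_part in matches:
    print(url_part)
```

The "avtor" branch still needs its own alternative because its segments contain letters, so one "|" remains.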
Edited 3 time(s). Last edit at 2020-03-15 15:29 by maki.