Problem Extract URL

#1 Problem Extract URL

Posted by: maki | Date: 2020-03-15 11:28

Problem Extract URL

^https?:\/\/(?www.)website\.ru/(\d+)/(\d+)/(\d+\-*)|avtor/+[A-Za-z0-9-]$

Must match:

website
www.website
website:
website:
website-
www.website
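Editor's sketch, not part of the original post: a quick check of the expression above using Python's `re` module. PSPad's regex engine may behave differently, so treat this only as an illustration. The `repaired` pattern and the sample string "avtor/x" are assumptions made up for the demonstration. It shows two things: `(?www.)` is not valid group syntax (an optional "www." is commonly written `(?:www\.)?`), and the top-level `|` splits the whole expression into two independent branches.

```python
import re

# The expression from the post, unchanged.
original = r"^https?:\/\/(?www.)website\.ru/(\d+)/(\d+)/(\d+\-*)|avtor/+[A-Za-z0-9-]$"

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Hypothetical repair of the group only: "(?:www\.)?" is an optional,
# non-capturing "www." prefix. Everything else is left as posted.
repaired = r"^https?://(?:www\.)?website\.ru/(\d+)/(\d+)/(\d+-*)|avtor/+[A-Za-z0-9-]$"

original_ok = compiles(original)   # "(?w" is rejected by Python's re
repaired_ok = compiles(repaired)

# Because "|" has the lowest precedence, the second branch
# "avtor/+[A-Za-z0-9-]$" matches on its own, with no host in front of it.
second_branch_alone = re.search(repaired, "avtor/x") is not None
```

If the two alternatives are meant to share the host part, they would have to be wrapped in a group after `website\.ru/`, for example `(?:first|second)`.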

Edited 1 time(s). Last edit at 2020-03-15 11:28 by maki.


#2 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 11:56

It won't match http, because your expression contains https.


#3 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 12:11

OK, but even without a protocol it doesn't work properly. Another part of the expression is the problem.

Edited 1 time(s). Last edit at 2020-03-15 12:12 by maki.


#4 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 12:26

You must include both protocols.


#5 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 12:31

protocol ???

(https|http)|www\d{0,3}[.]

The problem is the "(\-*)" part of the regex. It should match:

XXX/XXX/XXX/XXX

XXX/XXX/XXX-XXX

X = NUMBER
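Editor's sketch, not part of the original post: one possible reading of the two shapes above, checked with Python's `re` module. It assumes each XXX is a run of one or more digits and that the last separator is either "/" or "-". The sample strings are made up, and PSPad's engine may differ in details.

```python
import re

# Three digit groups separated by "/", then a fourth group
# joined by either "/" or "-".
tail = re.compile(r"\d+/\d+/\d+[/-]\d+")

# Hypothetical sample paths; the last one has only three groups.
samples = ["123/456/789/012", "123/456/789-012", "123/456/789"]
results = [bool(tail.fullmatch(s)) for s in samples]
```

The character class `[/-]` expresses "slash or hyphen" in one place, which avoids a separate `|` branch for each shape.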

Edited 2 time(s). Last edit at 2020-03-15 12:34 by maki.


#6 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 12:35

It's not a problem with regular expressions, but a problem with your expression.
I thought you must be an expert already, but you still have problems reading your own expression.
If you want to use any "tool", you need to understand it, not just copy/paste something and complain that it doesn't work.


#7 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 12:40

Separating out the valid addresses will not work in this case, because it will still extract them together with additional text.
Multiple links on one line are also invalid.
It doesn't need to handle many domains.

Writing the expression with "|" works, but it is unwieldy. I would like to simplify it into a single combined pattern.

I really don't need the protocol, just \.ru

Edited 5 time(s). Last edit at 2020-03-15 12:47 by maki.


#8 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 12:48

What about this one:
^(http|https):\/\/[\.\w\-_\:]+(\/[\w\-_]+)+

It matches all your examples.
PS: There is no need to escape the / character, so for better readability you can use:
^(http|https)://[\.\w\-_\:]+(/[\w\-_]+)+
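Editor's sketch, not part of the original post: the unescaped expression above, tried in Python's `re` module against a few made-up URLs (the thread's real examples were stripped from the page, so these are assumptions). PSPad's engine may differ, so this is only an illustration of how the pattern behaves.

```python
import re

# The readable variant from the post, unchanged.
url = re.compile(r"^(http|https)://[\.\w\-_\:]+(/[\w\-_]+)+")

# Hypothetical test lines; none of them come from the thread.
samples = [
    "https://www.website.ru/12/34/56-",  # https, www, trailing hyphen
    "http://website.ru/avtor/name-1",    # http, no www, text path
    "website.ru/12/34/56",               # no protocol at all
]
matched = [bool(url.match(s)) for s in samples]

# The pattern is not tied to one domain: any host is accepted.
other_host = bool(url.match("http://example.com/a"))
```

The third sample fails because the pattern requires a protocol at the start of the line, and the last check shows that the host part is generic rather than limited to a `.ru` site.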

Edited 1 time(s). Last edit at 2020-03-15 12:49 by pspad.


#9 Re: Problem Extract URL

Posted by: maki | Date: 2020-03-15 12:50

Next part ???

\.ru/([^/]+)/([^/]+)/([^/]+)/(\d+-*\d+)
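Editor's sketch, not part of the original post: a look at the last group, `\d+-*\d+`, using Python's `re` module. Each `\d+` needs at least one digit, so the group as posted cannot match a single-digit segment. The `alternative` pattern is an assumption about the intent (digits, optionally followed by a hyphen and more digits), and the samples are made up.

```python
import re

# Last group of the posted expression, on its own.
posted = re.compile(r"\d+-*\d+")

# Hypothetical alternative: the "-digits" suffix is optional as a whole.
alternative = re.compile(r"\d+(?:-\d+)?")

samples = ["7", "12", "12-34", "12-"]
posted_hits = [bool(posted.fullmatch(s)) for s in samples]
alternative_hits = [bool(alternative.fullmatch(s)) for s in samples]
```

With these samples the two patterns differ only on "7": the posted form needs two digits in total, while the alternative accepts one.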


#10 Re: Problem Extract URL

Posted by: pspad | Date: 2020-03-15 13:05

Why a next part? All the examples you provided match the regular expression I sent in my previous answer.
But I guess you have more and more URLs that are totally different from your examples, as always, right?
