
Re: I am looking for a special pattern "Bible Verse Regex"


#11 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: Haunebu | Date: 2021-04-10 08:17 | IP: IP Logged

Please show a screenshot from PSPad in which all the characters are fully matched. Even without any text, the regular expression is incorrect.



#12 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: pspad | Date: 2021-04-10 08:42 | IP: IP Logged

Please provide more complex text examples, with some surrounding text, and mark what you want to find.



#13 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: Haunebu | Date: 2021-04-10 10:45 | IP: IP Logged

All characters should be matched
[():,;-. ]

(Rdz 1:9, 10, 13; Neh 9:6; Dz 4:24; 14:15; Obj 14:7)



#14 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: pspad | Date: 2021-04-10 11:10 | IP: IP Logged

Please read my previous answer again, slowly.



#15 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: Haunebu | Date: 2021-04-10 13:02 | IP: IP Logged

It's better not to explain. As you can see, the pattern should match only the characters in red.
It should match a total of 52 characters.
image

Edited 2 time(s). Last edit at 2021-04-10 13:04 by Haunebu.



#16 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: Haunebu | Date: 2021-04-10 15:57 | IP: IP Logged

Invalid Bible Verse Regex
([a-žA-Ž0-9.]+\s+)+([0-9:.,;-]+\s{0,})+[0-9]

image

Edited 1 time(s). Last edit at 2021-04-10 15:58 by Haunebu.



#17 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: vbr | Date: 2021-04-10 17:56 | IP: IP Logged

Haunebu:
Invalid Bible Verse Regex
([a-žA-Ž0-9.]+\s+)+([0-9:.,;-]+\s{0,})+[0-9]
...

Hi, if you want to avoid such "false positive" matches, which are not meaningful for you, the pattern will need to list the individual book names together with their abbreviations and variants.
Also, matching multiple citations of different books or passages in a single match is rather difficult and error-prone with regex.
One possibility would be to match the general format of the citation, if there is one, e.g. parentheses, colons, and numbers in the expected positions.
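
The "false positive" problem can be reproduced outside PSPad. The following is a minimal sketch in Python, assuming that Python's `re` module treats this pattern the same way PSPad's regex engine does (this may not hold exactly). The two sample sentences and the helper name `false_positive` are made up for illustration and do not come from the thread.

```python
import re

# Character ranges from the quoted pattern, written with escapes:
# \u017e is z-caron and \u017d is Z-caron.
LETTERS = "a-\u017eA-\u017d"

# The pattern quoted above, otherwise unchanged.
PATTERN = re.compile(r"([" + LETTERS + r"0-9.]+\s+)+([0-9:.,;-]+\s{0,})+[0-9]")


def false_positive(text):
    """Return the text matched by PATTERN, or None if nothing matches."""
    m = PATTERN.search(text)
    return m.group(0) if m else None


# Ordinary prose that merely ends in a number is matched as a whole.
for sentence in ("see page 12", "born in 1999"):
    print(repr(sentence), "->", repr(false_positive(sentence)))
```

Any run of words followed by a number satisfies the pattern, because nothing in it is specific to book names or to the chapter:verse structure.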

Note that the accented characters are not treated alphabetically here, but most likely as Unicode code points, i.e.
[a-žA-Ž]
will match e.g. the following characters:
A...Z [ \ ] ^ _ ` a...z { | } ~ ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě Ĝ ĝ Ğ ğ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į İ ı IJ ij Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ ŀ Ł ł Ń ń Ņ ņ Ň ň ʼn Ŋ ŋ Ō ō Ŏ ŏ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş Š š Ţ ţ Ť ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů Ű ű Ų ų Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž ſ Μ

Mostly, it catches basic and accented Latin letters, but also some non-letter characters which happen to have code points between A and ž, and there are some other peculiarities as well.
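
The code point behaviour can be checked with a short script. This is a sketch in Python, assuming that PSPad's engine also interprets the range by code point as Python's `re` does; it is case-sensitive, so it does not reproduce the extra case-folded characters at the end of the list above. The names `CLASS`, `matched_chars`, and `non_letters` are hypothetical.

```python
import re
import unicodedata

# The class from the thread, written with escapes:
# \u017e is z-caron and \u017d is Z-caron.
CLASS = re.compile("[a-\u017eA-\u017d]")


def matched_chars(limit=0x250):
    """Return every character below `limit` that the class matches."""
    return [chr(cp) for cp in range(limit) if CLASS.fullmatch(chr(cp))]


def non_letters(chars):
    """Keep only characters whose Unicode category is not a letter (L*)."""
    return [c for c in chars if not unicodedata.category(c).startswith("L")]


chars = matched_chars()
print(len(chars), "characters matched, from", hex(ord(chars[0])), "to", hex(ord(chars[-1])))
print("non-letters among them:", " ".join(ascii(c) for c in non_letters(chars)))
```

The two ranges together cover one contiguous block from `A` (U+0041) to `ž` (U+017E), so brackets, the backslash, the multiplication sign, and similar symbols are included, while digits and the space are not.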

I'd suggest using a rather general pattern based on the citation format (if possible without false negatives, i.e. without missing useful matches) and dealing with the false positives in further steps.
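
A format-based pattern of this kind could look roughly as follows. This is only a sketch in Python under stated assumptions, not a tested PSPad expression: it assumes citations always sit inside parentheses, that references are separated by semicolons, that the first reference names a book, and that later references may omit it. The names `BOOK`, `VERSES`, `FIRST_REF`, `NEXT_REF`, `GROUP`, and `find_citation_groups` are all hypothetical, and the surrounding sentence in the example is invented.

```python
import re

# Hypothetical building blocks; none of these names come from the thread.
BOOK = r"(?:[1-3]\s?)?[^\W\d_]+\.?"            # e.g. "Rdz", "Neh", "1 Kor."
VERSES = r"\d+(?:-\d+)?(?:,\s*\d+(?:-\d+)?)*"  # e.g. "9, 10, 13" or "1-3"
FIRST_REF = rf"{BOOK}\s+\d+:{VERSES}"          # book, chapter, colon, verses
NEXT_REF = rf"(?:{BOOK}\s+)?\d+:{VERSES}"      # book may be omitted ("14:15")
GROUP = re.compile(rf"\({FIRST_REF}(?:;\s*{NEXT_REF})*\)")


def find_citation_groups(text):
    """Return every parenthesised citation group found in `text`."""
    return [m.group(0) for m in GROUP.finditer(text)]


sample = "(Rdz 1:9, 10, 13; Neh 9:6; Dz 4:24; 14:15; Obj 14:7)"
text = "Some text " + sample + " and (draft 3) more."
for group in find_citation_groups(text):
    print(len(group), group)
```

Because the pattern insists on the parentheses and on a chapter:verse pair, a parenthesised remark such as "(draft 3)" is skipped, while the whole sample group from post #13 is returned as one match. Book names are not validated, so false positives remain possible and would still need filtering in a later step.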

hth,
vbr

Edited 1 time(s). Last edit at 2021-04-10 17:57 by vbr.



#18 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: Haunebu | Date: 2021-04-11 11:36 | IP: IP Logged

vbr, ALL of Unicode is completely unnecessary!

No, it was pspad (the administrator) who mistakenly suggested using [A-Z] (Unicode), which is not recommended here!
This is a totally wrong pattern, and it doesn't matter which language is used.

This is the best pattern, but it needs several patches to handle repeated combinations of multiple numbers, multiple spaces, multiple semicolons, and brackets
(;)s0-9
(?:\d|I{1,3})?\s?\w{2,}\.?\s*\d{1,}\:\d{1,}-?,?\d{0,2}(?:,\d{0,2}){0,2}
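
What this pattern covers can be checked against the sample line from post #13. The snippet below is a sketch in Python, assuming Python's `re` module behaves like PSPad's engine for this expression, which may not be exactly true; the names `PATTERN`, `SAMPLE`, and `coverage` are hypothetical.

```python
import re

# The pattern from this post, unchanged.
PATTERN = re.compile(
    r"(?:\d|I{1,3})?\s?\w{2,}\.?\s*\d{1,}\:\d{1,}-?,?\d{0,2}(?:,\d{0,2}){0,2}"
)

# The sample line from post #13 (52 characters in total).
SAMPLE = "(Rdz 1:9, 10, 13; Neh 9:6; Dz 4:24; 14:15; Obj 14:7)"


def coverage(pattern, text):
    """Return the list of matches and how many characters they cover."""
    matches = [m.group(0) for m in pattern.finditer(text)]
    return matches, sum(len(m) for m in matches)


matches, covered = coverage(PATTERN, SAMPLE)
for m in matches:
    print(repr(m))
print(covered, "of", len(SAMPLE), "characters matched")
```

Under Python's `re`, the matches are the four references that carry a book name; the continuation verses ", 10, 13", the book-less reference "14:15", the semicolons, and the brackets are left out, which is consistent with the patches listed above.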

Edited 5 time(s). Last edit at 2021-04-11 11:44 by Haunebu.



#19 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: pspad | Date: 2021-04-11 12:54 | IP: IP Logged

Sorry, I tried to help you. If my solution doesn't suit you, that's OK with me.
You have as much time as you need to tune your expression.
But in my opinion, your search patterns are too different. It will be very hard to write one regular expression that matches all the forms you requested.



#20 Re: I am looking for a special pattern "Bible Verse Regex"

Posted by: Haunebu | Date: 2021-04-11 15:23 | IP: IP Logged

Please be so kind as to show me a screenshot of PSPad in which your pattern matches the verses from my first post.







