
div class tag regex issue


#1 div class tag regex issue

Posted by: maki | Date: 2020-01-27 15:46 | IP: IP Logged

<div class="wall_post_text">Музыка...т жизни: Ничего Невозможного Нет. Люблю каждого, каждому дарю частичку света и радости. smiling smiley<br><br>Предпочитаю слова и предложения на русском литературном языке ресных людей, с кем можно говорить, говорить обо всемsmiling smiley<br><br>Кремён, СПб, Почтамт до востребования, 190000<br><br>жду:*</div>

How can I extract it as plain text, with each <br><br>-separated part on its own line?

Музыка...т жизни: Ничего Невозможного Нет. Люблю каждого, каждому дарю частичку света и радости. smiling smiley
Предпочитаю слова и предложения на русском литературном языке ресных людей, с кем можно говорить, говорить обо всемsmiling smiley
Кремён, СПб, Почтамт до востребования, 190000
жду:*

My expression is not perfect. Please help!

(?<=div class="wall_post_text">).[^\[\]\{\}]*?(?=<br>)|(?<=br>).[^\[\]\{\}]*?(?=<br>)
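To see where this expression falls short, here is a minimal Python sketch that runs it unchanged against a shortened, hypothetical sample with the same structure (the words one/two/three are placeholders, not the real data). It assumes Python's `re` module behaves like the editor's engine for these constructs, which may not hold exactly.

```python
import re

# The expression from the post above, unchanged.
PATTERN = re.compile(
    r'(?<=div class="wall_post_text">).[^\[\]\{\}]*?(?=<br>)'
    r'|(?<=br>).[^\[\]\{\}]*?(?=<br>)'
)

# Hypothetical shortened sample with the same tag structure as the post.
sample = '<div class="wall_post_text">one<br><br>two<br><br>three</div>'

matches = PATTERN.findall(sample)
for m in matches:
    print(repr(m))
# prints 'one' and '<br>two'
```

Two problems show up in this sketch: the second match starts right after the first `<br>` of a `<br><br>` pair, so it swallows the second `<br>`, and the last segment is never matched because it is followed by `</div>` rather than `<br>`.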

Edited 3 time(s). Last edit at 2020-01-27 15:53 by maki.



#2 Re: div class tag regex issue

Posted by: Vany | Date: 2020-01-27 16:24 | IP: IP Logged

What about replacing <br><br> with \n as a first step?

--
Vany
(PSPad 5.5.1.812 x32, W10h/p x64 en/cs)
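The replace-first idea can be sketched outside the editor as well. The following Python example is only an illustration under assumptions: the function name `div_to_lines` is hypothetical, and it treats any run of `<br>` tags as one separator, which is slightly broader than the literal `<br><br>` suggested above.

```python
import re
from typing import List

def div_to_lines(html_text: str) -> List[str]:
    """Return the text segments of the first wall_post_text div."""
    m = re.search(r'<div class="wall_post_text">(.*?)</div>',
                  html_text, re.DOTALL)
    if m is None:
        return []
    # Step 1: turn every run of <br> tags into a single newline.
    text = re.sub(r'(?:<br>)+', '\n', m.group(1))
    # Step 2: split on the newlines and drop empty segments.
    return [line for line in text.split('\n') if line]

sample = '<div class="wall_post_text">one<br><br>two<br><br>three</div>'
print(div_to_lines(sample))
```

Doing the replacement first means the extraction itself no longer needs lookbehinds or alternation.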



#3 Re: div class tag regex issue

Posted by: pspad | Date: 2020-01-27 16:32 | IP: IP Logged

Maki, you keep asking the same thing again and again.
Last time you opened your HTML in a browser, copied the text from it, and you were happy. Do it again: you will save your time, you will save our time, and you will be happy again.



#4 Re: div class tag regex issue

Posted by: maki | Date: 2020-01-27 16:41 | IP: IP Logged

pspad, the first issue (HTML to TXT) is fully resolved.

Now I have a second problem, and it is much harder: another 1 GB file (1,000,000,000 characters).
This time it is a Java log with HTML in it, to be converted to HTML.
It is not easy to convert. The code is very complex.

Vany's suggestion does not work for me.

Edited 2 time(s). Last edit at 2020-01-27 16:44 by maki.



#5 Re: div class tag regex issue

Posted by: maki | Date: 2020-01-27 17:26 | IP: IP Logged

I want to extract the text:

<div class="wall_post_text">extract text</div>


Edited 2 time(s). Last edit at 2020-01-27 17:28 by maki.



#6 Re: div class tag regex issue

Posted by: maki | Date: 2020-01-27 17:52 | IP: IP Logged

My regex is still wrong:

<a\s+(?:[^>]*?\s+)?href=\\"\\/wall-(\d+)\?q=.*?<\\/span>

OR

<a\s+(?:[^>]*?\s+)?href=\\"\\/wall\-(\d+)\?q\=.+<\\/span>

Edited 1 time(s). Last edit at 2020-01-27 17:59 by maki.



#7 Re: div class tag regex issue

Posted by: pspad | Date: 2020-01-27 18:35 | IP: IP Logged

If you want to extract your lines, open the search dialog and search for this regular expression:
<div class="wall_post_text">(.*)</div>
Then use the COPY button.
It will copy only the lines that contain the content you are looking for.

Second step, applied to the result:
Search: <div class="wall_post_text">(.*)</div>
Replace: $1
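For a file too large to handle comfortably in an editor, the same two steps can be approximated with a line-by-line script. This is a hedged sketch, not the PSPad mechanism itself: the function name `extract_div_text` is hypothetical, and Python writes the `$1` backreference as `\1`.

```python
import re
from typing import Iterable, Iterator

# Same search expression as in the post above.
DIV = re.compile(r'<div class="wall_post_text">(.*)</div>')

def extract_div_text(lines: Iterable[str]) -> Iterator[str]:
    for line in lines:
        # Step 1 (the COPY button): keep only lines containing the div.
        if DIV.search(line):
            # Step 2 (Replace: $1): keep the captured text, drop the tags.
            yield DIV.sub(r'\1', line.rstrip('\n'))

demo = [
    '<p>skip me</p>\n',
    'x <div class="wall_post_text">hello<br>world</div> y\n',
]
result = list(extract_div_text(demo))
print(result)
```

Because the function consumes any iterable of lines, an open file object can be passed in directly, so a 1 GB file would be read one line at a time rather than loaded whole. Note that `(.*)` is greedy: if one line held two such divs, the capture would run from the first opening tag to the last closing tag.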



#8 Re: div class tag regex issue

Posted by: maki | Date: 2020-01-27 18:50 | IP: IP Logged

[Window Title]
Info

[Content]
Occurrence of "<div class="wall_post_text">(.*)</div>" was found 0 times

[OK]

Unfortunately it doesn't work. The tags are completely different in the log.

Example:

<div class=\"wall_post_text\"><a href=\"\/feed?section=search&q=%23%D0%91%D0%9F_%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D1%8F\">#БП_Россия<\/a><br><br>Привет&#33;<br>Меня деле.<br>Больше всего я люблю учиться,путетвовать и учить языки. В собеседнике ищу желание переписываться и только. Мы вместе учить языки,рассказывать о своих будх, дискутировать на проблемные темы и все что ты захочешь.<br>Адрес дам в лс.<\/div>



#9 Re: div class tag regex issue

Posted by: pspad | Date: 2020-01-27 18:57 | IP: IP Logged

Why do you send an example that differs from your real text?

The regular expression for your current text is:
<div class=\\"wall_post_text\\">(.*)<\\/div>

If your real text differs from what you have sent, modify the regular expression accordingly.
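As an illustration of why the doubled backslashes matter, here is a small Python sketch using the expression above on a shortened, hypothetical log line. The function name `extract_segments`, the sample text, and the extra steps (splitting on runs of `<br>` and decoding entities such as `&#33;`) are assumptions added for the example, not part of the advice above.

```python
import html
import re

# The pattern from the post above; each \\ matches one literal backslash.
ESCAPED_DIV = re.compile(r'<div class=\\"wall_post_text\\">(.*)<\\/div>')

def extract_segments(log_line):
    m = ESCAPED_DIV.search(log_line)
    if m is None:
        return []
    # Split the captured text on runs of <br> and decode HTML entities.
    parts = re.split(r'(?:<br>)+', m.group(1))
    return [html.unescape(p) for p in parts if p]

# Hypothetical log line in the escaped form (\" and \/) shown in post #8.
sample = r'<div class=\"wall_post_text\">Hi&#33;<br><br>Address by PM.<\/div>'
print(extract_segments(sample))
# prints ['Hi!', 'Address by PM.']
```

The same function returns an empty list for the unescaped form `<div class="wall_post_text">`, which mirrors the "found 0 times" result when the pattern and the real text disagree about escaping.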



#10 Re: div class tag regex issue

Posted by: maki | Date: 2020-01-27 19:02 | IP: IP Logged

Because the text contains private data, so I limited what I posted. If you want the real text, I can send it by e-mail.







