How to combine several regular expressions
Posted by: pspad | Date: 2017-11-01 10:56 | IP: IP Logged
pspad: Just tested. When I save it as ANSI 1251, PSPad isn't able to detect the correct encoding because there aren't enough Cyrillic characters.
In this case I change the encoding in the Encoding menu to ANSI 1251 and reload (Ctrl+R) to see the correct content.
When I save your sample as UTF-8 without BOM and reopen it, PSPad opens it correctly as a UTF-8 file.
I made some modifications to the autodetection algorithm, and now PSPad detects your sample correctly as ANSI 1251, even though it contains only a few letters.
Posted by: pspad | Date: 2017-11-01 11:10 | IP: IP Logged
MAKI, can you please tell me how you would do encoding autodetection? ANSI files don't contain any code page information.
Give me any reasonable suggestion that would lead to better autodetection and I will implement it.
Posted by: maki | Date: 2017-11-01 11:12 | IP: IP Logged
pspad: If you want to search for the '\' character, you need to escape it with another '\' character. That means you use \\, not \\\
Now I'm using \\
Nothing has been extracted, so it's still an invalid regex.
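For readers following the escaping advice above, here is a minimal sketch in Python. It is only an illustration: PSPad's regex engine is not Python's `re` module, and the sample path is made up. The escaping rule shown (two backslashes in the pattern match one literal backslash, and a trailing unpaired backslash is an error) is common to most regex flavours.

```python
import re

# Hypothetical sample text containing two literal backslashes.
text = r"C:\temp\file.txt"

# In the pattern, a literal backslash is written as two backslashes.
# The raw-string prefix keeps Python from adding its own escaping layer.
literal_backslash = re.compile(r"\\")

backslash_count = len(literal_backslash.findall(text))
parts = literal_backslash.split(text)

# Three backslashes leave one unpaired backslash at the end of the
# pattern, which Python's engine rejects as an invalid expression.
three_backslashes = "\\" * 3  # the pattern text is: \\\
try:
    re.compile(three_backslashes)
    three_is_valid = True
except re.error:
    three_is_valid = False
```

If a pattern with `\\` compiles but still extracts nothing, the problem is more likely in the rest of the expression than in the backslash escaping.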
Posted by: pspad | Date: 2017-11-01 11:16 | IP: IP Logged
Simple question: is there any reason why you don't use the remove tag function from the HTML menu? Doesn't it work for you?
Posted by: pspad | Date: 2017-11-01 11:18 | IP: IP Logged
You wrote an example earlier:
<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>
What should the result be? Can you write the resulting text?
Posted by: maki | Date: 2017-11-01 11:18 | IP: IP Logged
Disable detect HTML/XML Charset
My settings from another text editor properly detect virtually any encoding.
Edited 1 time(s). Last edit at 2017-11-01 11:20 by maki.
Posted by: pspad | Date: 2017-11-01 11:22 | IP: IP Logged
What I can say is that PSPad on my computer was able to detect UTF-8 (without BOM) from your sample correctly.
It is possible that your text contains not only UTF-8 but a mix of ANSI and UTF-8 characters. In that case UTF-8 encoding can't be used, because it would damage the ANSI characters.
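To illustrate the point about mixed content, here is a rough sketch of one common autodetection heuristic: try a strict UTF-8 decode first and fall back to an ANSI code page only if that fails. This is not PSPad's actual algorithm. The function name `guess_encoding` and the choice of cp1251 as the fallback are assumptions made for this example.

```python
def guess_encoding(data: bytes) -> str:
    """Return 'utf-8' if the bytes decode strictly as UTF-8,
    otherwise the assumed ANSI fallback 'cp1251'."""
    try:
        data.decode("utf-8")  # strict mode: any invalid sequence raises
        return "utf-8"
    except UnicodeDecodeError:
        return "cp1251"


sample = "Так же"
utf8_bytes = sample.encode("utf-8")
ansi_bytes = sample.encode("cp1251")

# A file that mixes both encodings is no longer valid UTF-8 as a whole,
# so the strict check rejects it and the heuristic falls back to ANSI.
mixed_bytes = utf8_bytes + ansi_bytes

guesses = [guess_encoding(b) for b in (utf8_bytes, ansi_bytes, mixed_bytes)]
```

The check cannot tell one ANSI code page from another (1251 versus 1252, for example). That part still needs a statistical guess or a manual choice in the Encoding menu.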
Posted by: maki | Date: 2017-11-01 11:24 | IP: IP Logged
pspad: You wrote an example earlier:
<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>
What should the result be? Can you write the resulting text?
code remove
<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>
code remove
It should only extract the text + tag <br> or </br>
<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>
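One possible reading of this request is "drop every tag except `<br>` and keep the text". The sketch below is in Python under that assumption. The pattern and the helper name `keep_text_and_br` are illustrative, not taken from this thread, and PSPad's regex flavour may need a different syntax for the lookahead.

```python
import re

# Matches any tag except <br>, <br/>, <br /> and </br>.
# The negative lookahead (?!...) refuses to match when the tag is a br tag.
NON_BR_TAG = re.compile(r"<(?!/?br\s*/?>)[^>]*>", re.IGNORECASE)


def keep_text_and_br(html: str) -> str:
    """Remove all tags except br tags, leaving the text untouched."""
    return NON_BR_TAG.sub("", html)


# The sample from this thread contains only <br> tags.
thread_sample = r'''<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>'''

# A made-up sample with extra tags, to show what actually gets removed.
tagged_sample = "<p><b>А __</b><br>Так же</p><br>"

thread_result = keep_text_and_br(thread_sample)
tagged_result = keep_text_and_br(tagged_sample)
```

Under this reading the thread's sample comes back unchanged, because it has no tags other than `<br>`. The pattern is a simplification and would also eat a bare `<` in the text if a `>` follows later on the line.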
Posted by: pspad | Date: 2017-11-01 11:27 | IP: IP Logged
maki:
<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>
It should only extract the text + tag <br> or </br>
<br>А __<br>Так же ", \"сонный\", \" убитый\". Глупоо человек доверяет мне.<br>
Sorry, but it looks like you are making fun of me. Both lines are the same. Can you write the text as it should look after applying the regular expression? That is, the result you want to get.
Posted by: maki | Date: 2017-11-01 11:32 | IP: IP Logged
The header
charset=windows-1251
means Cyrillic. You will have to change this to
charset=UTF-8
if you want the HTML to be displayed as UTF-8.
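Changing the declaration alone is not enough if the file's bytes are still in Windows-1251; the content has to be re-encoded as well. Here is a minimal sketch in Python, assuming the input really is Windows-1251 throughout. The function name `convert_1251_html_to_utf8` and the sample meta tag are made up for this example.

```python
import re

CHARSET_DECL = re.compile(r"charset=windows-1251", re.IGNORECASE)


def convert_1251_html_to_utf8(data: bytes) -> bytes:
    """Decode Windows-1251 HTML, update its charset declaration,
    and return the document re-encoded as UTF-8 (no BOM)."""
    text = data.decode("cp1251")
    text = CHARSET_DECL.sub("charset=UTF-8", text)
    return text.encode("utf-8")


# Hypothetical input: a meta tag followed by a bit of Cyrillic text.
source_html = (
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=windows-1251"><br>Так же<br>'
).encode("cp1251")

converted = convert_1251_html_to_utf8(source_html)
```

After the conversion, the declared charset and the actual byte encoding agree, so an editor that trusts the declaration and one that inspects the bytes should reach the same result.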
The solution is:
How do I disable this in PSPad?
Detect HTML/XML Charset