Re: UTF-8

#1 UTF-8

Posted by: MadCompie | Date: 2018-03-18 17:03

Hello, sometimes a file isn't detected as UTF-8 and I don't know why, even after forcing UTF-8 and saving. First I have to disable autodetect and choose UTF-8; then the file opens fine.
But when I close everything and enable autodetect with the default code page set to ANSI, the same file isn't loaded as UTF-8.
Would it be possible to associate a file extension with a certain code page?
Example: always open .xslt files as UTF-8. In that case, autodetection should be overruled.


#2 Re: UTF-8

Posted by: pspad | Date: 2018-03-18 18:51

Hello

Only characters with a code value of 128 and above are encoded differently in UTF-8. If there are no such characters, ANSI and UTF-8 are the same, and there is no way to detect whether the file is ANSI or UTF-8 unless a BOM is included at the beginning of the file.
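
To illustrate the point, here is a minimal Python sketch (not PSPad's own code). It assumes that "ANSI" means Windows-1252 (`cp1252`); the helper name `has_utf8_bom` and the sample strings are made up for the example. It shows that ASCII-only bytes decode identically either way, that a char above 127 is stored differently, and how a BOM can be checked.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

# ASCII-only content: the bytes are the same in ANSI (assumed cp1252) and UTF-8,
# so nothing in the file tells the two encodings apart.
ascii_only = "<xsl:value-of select='name'/>".encode("ascii")
same_in_both = ascii_only.decode("utf-8") == ascii_only.decode("cp1252")

# A char above 127 is where the encodings differ.
utf8_bytes = "é".encode("utf-8")    # two bytes: C3 A9
ansi_bytes = "é".encode("cp1252")   # one byte:  E9

print(same_in_both)
print(utf8_bytes, ansi_bytes)
print(has_utf8_bom(codecs.BOM_UTF8 + ascii_only), has_utf8_bom(ascii_only))
```

Without a BOM and without any byte above 127, a detector has nothing to go on and has to fall back to the configured default code page.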


#3 Re: UTF-8

Posted by: pspad | Date: 2018-03-18 18:53

If you want to reload a file in another encoding, change the encoding and reload the file (Ctrl+R). It is not necessary to switch off autodetection.


#4 Re: UTF-8

Posted by: MadCompie | Date: 2018-03-21 12:01

What a great tip, thx!
Yes indeed that does the job.
But I wonder why some files (all without BOM) open as UTF-8 and others still don't...
This happens even after doing the conversion and saving as UTF-8 without BOM.
Anyway, thanks to your tip it's much easier to go on!


#5 Re: UTF-8

Posted by: pspad | Date: 2018-03-21 12:08

If the file doesn't contain any UTF-8 encoded chars (chars with a code value > 127), there is no difference between ANSI and UTF-8 in the file content, so PSPad can't tell them apart.


#6 Re: UTF-8

Posted by: Andreas | Date: 2018-03-21 22:42

I recommend setting the default code page to UTF-8 without BOM to be on the safe side. What do you need ANSI for?


#7 Re: UTF-8

Posted by: hhoefling | Date: 2018-03-22 10:20

Me too :-)

Or...
always put a char above 127 in the first lines.
Maybe in a comment at the top.

--
by HH


#8 Re: UTF-8

Posted by: MadCompie | Date: 2018-03-28 11:03

hhoefling:
Me too :-)

Or...
always put a char above 127 in the first lines.
Maybe in a comment at the top.

That is exactly what I do... at the beginning of the file:
<!-- this is a dummy char é -->

With that in place, the file is recognized as UTF-8.
But without it, even with an é somewhere in the middle of the file, it is not recognized.
So I think that PSPad does not search the whole document for special chars, probably for speed reasons?


#9 Re: UTF-8

Posted by: pspad | Date: 2018-03-28 11:07

PSPad searches only about the first 20,000 chars, not a whole document of millions of chars.
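
A rough Python sketch of how such a limited scan behaves. This is only an illustration, not PSPad's actual algorithm: the function name `guess_encoding`, the byte-based window of 20,000, and the "ansi" fallback label are all assumptions made for the example.

```python
import codecs

def guess_encoding(data: bytes, window: int = 20_000) -> str:
    """Guess 'utf-8-bom', 'utf-8' or 'ansi' by looking only at the first `window` bytes."""
    head = data[:window]
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-bom"
    if all(b < 0x80 for b in head):
        # Pure ASCII inside the window: nothing to detect, use the default code page.
        return "ansi"
    try:
        # final=False tolerates a multi-byte sequence cut off at the window edge.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return "ansi"
    return "utf-8"

body = b"a" * 30_000
late_char = body + "é".encode("utf-8")                         # é only after the window
early_char = "<!-- dummy char é -->".encode("utf-8") + body    # é in a comment at the top

print(guess_encoding(late_char))
print(guess_encoding(early_char))
```

In this sketch the file whose only é sits beyond the window falls back to the default, while the one with the dummy comment at the top is recognized as UTF-8, which matches the behaviour described in this thread.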


#10 Re: UTF-8

Posted by: MadCompie | Date: 2018-03-29 17:43

pspad:
PSPad searches only about the first 20,000 chars, not a whole document of millions of chars.

I thought so, no problem... I can work around this by adding some dummy text with a UTF-8 char at the beginning of the file!
Thx for clearing this up!
