Re: UTF-8 BOM only

#1 UTF-8 BOM only

Posted by: Vany | Date: 05/09/2017 15:29

If a file contains only the UTF-8 BOM (that is, it is exactly 3 bytes long), PSPad doesn't detect it as a UTF-8 file and displays the BOM bytes as strange characters in the default code page, even if UTF-8 is chosen in the Format menu.
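To reproduce the case outside the editor, here is a minimal Python sketch of what "BOM only" means. The helper name `is_bom_only` is made up for illustration and has nothing to do with PSPad's own code.

```python
import codecs


def is_bom_only(data: bytes) -> bool:
    """Return True when the data is exactly the 3-byte UTF-8 BOM and nothing else.

    Hypothetical helper for illustration only.
    """
    return data == codecs.BOM_UTF8


# The UTF-8 BOM is the byte sequence EF BB BF.
data = codecs.BOM_UTF8
print(len(data))          # 3
print(is_bom_only(data))  # True

# Decoding with "utf-8-sig" strips the BOM, so a BOM-only file is an empty text.
print(repr(data.decode("utf-8-sig")))
```

Writing `codecs.BOM_UTF8` to a file in binary mode gives a 3-byte test file of the kind described above.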

--
Vany
(PSPad unicode 5.0.0 (251), W7p x64 cs)

Edited 1 time(s). Last edit at 05/09/2017 15:33 by Vany.


#2 Re: UTF-8 BOM only

Posted by: pspad | Date: 05/09/2017 16:37

Use PSPad 5 from the developer forum. It's nearly ready for release and has much better code page handling.


#3 Re: UTF-8 BOM only

Posted by: Vany | Date: 07/31/2017 18:39

I'm using version 5.0.0 (235), but the behavior is still the same.

--
Vany
(PSPad unicode 5.0.0 (251), W7p x64 cs)


#4 Re: UTF-8 BOM only

Posted by: pspad | Date: 08/01/2017 04:59

Hello

Do you mean that you have a UTF-8 document with a BOM and PSPad doesn't detect it as UTF-8?

That can have only one cause: your document isn't valid, meaning it contains non-UTF-8 characters. Decoding it as UTF-8 would lose those non-UTF-8 characters. Typically this is a database export.

Can you send me an example? The smaller, the better.
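For readers who want to check a file for non-UTF-8 bytes themselves, a strict decode is enough. This is only a sketch in Python; `is_valid_utf8` is a hypothetical helper, not a PSPad function, and the sample string is made up.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True when every byte sequence in data is well-formed UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8")  # strict mode raises on any invalid sequence
    except UnicodeDecodeError:
        return False
    return True


good = "Dvořák".encode("utf-8")
bad = "Dvořák".encode("cp1250")  # single-byte ANSI text, not valid UTF-8

print(is_valid_utf8(good))  # True
print(is_valid_utf8(bad))   # False

# Forcing a UTF-8 decode replaces the invalid bytes with U+FFFD,
# so the original characters are lost.
lossy = bad.decode("utf-8", errors="replace")
print("\ufffd" in lossy)    # True
```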


#5 Re: UTF-8 BOM only

Posted by: Vany | Date: 08/01/2017 09:13

As I wrote in the first message: if the file is completely empty except for the BOM, so it is 3 bytes long, it is not detected as an empty UTF-8 file. Instead, PSPad opens it as a standard text file and shows these bytes using the ANSI Central European (1250) code page.

Code Page Autodetect is ON.
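The characters in question can be reproduced by decoding the three BOM bytes with code page 1250. A short Python sketch, just to show the effect:

```python
import codecs

# The UTF-8 BOM bytes EF BB BF, read as Windows-1250 (ANSI Central European).
mojibake = codecs.BOM_UTF8.decode("cp1250")

print(len(mojibake))  # 3 visible characters instead of an empty document
print(mojibake)       # ď»ż
```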

--
Vany
(PSPad unicode 5.0.0 (251), W7p x64 cs)

Edited 2 time(s). Last edit at 08/01/2017 09:20 by Vany.


#6 Re: UTF-8 BOM only

Posted by: pspad | Date: 08/09/2017 12:44

An empty file with UTF-8 encoding was a special case.
I tested whether the file size is > 0 B and the editor is empty after loading. In that case I open the file as ANSI. I will handle files that contain only a BOM.
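The check described here can be modeled roughly as follows. This is a Python sketch under assumptions, not PSPad's actual source: the function names, the returned codec names, and the choice of cp1250 as the ANSI fallback are all hypothetical.

```python
import codecs

ANSI = "cp1250"  # assumed ANSI fallback, taken from the report above


def choose_encoding_old(data: bytes) -> str:
    """Hypothetical model of the described check: a non-empty file whose
    decoded text is empty falls back to ANSI."""
    if data.startswith(codecs.BOM_UTF8):
        text = data.decode("utf-8-sig")
        if len(data) > 0 and text == "":
            return ANSI  # this branch catches the BOM-only file
        return "utf-8-sig"
    return ANSI


def choose_encoding_fixed(data: bytes) -> str:
    """Hypothetical fix: a leading BOM always means UTF-8, even with no text."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return ANSI


bom_only = codecs.BOM_UTF8
print(choose_encoding_old(bom_only))    # cp1250
print(choose_encoding_fixed(bom_only))  # utf-8-sig
```

Files with a BOM followed by real text take the same path in both versions; only the 3-byte case differs.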


#7 Re: UTF-8 BOM only

Posted by: Vany | Date: 09/05/2017 15:30

Thanks for the fix, from 241 on it is okay now.

--
Vany
(PSPad unicode 5.0.0 (251), W7p x64 cs)


#8 Re: UTF-8 BOM only

Posted by: Jarred | Date: 11/10/2017 08:00

I am using version 5.0.0 (251) and found a bug when changing the encoding of multiple files.
If I choose UTF-8 without BOM, the files are always converted to UTF-8 with BOM.
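To check the result of a conversion, it is enough to look at the first three bytes of each file. A minimal Python sketch of the expected difference between the two options; `to_utf8` and `has_bom` are hypothetical helpers, not part of PSPad.

```python
import codecs


def to_utf8(text: str, with_bom: bool) -> bytes:
    """Encode text as UTF-8, optionally prefixed with the BOM.

    Hypothetical helper for illustration only.
    """
    # Python's "utf-8-sig" codec writes the BOM, plain "utf-8" does not.
    return text.encode("utf-8-sig" if with_bom else "utf-8")


def has_bom(data: bytes) -> bool:
    """Return True when the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


print(has_bom(to_utf8("test", with_bom=True)))   # True
print(has_bom(to_utf8("test", with_bom=False)))  # False
```

Running `has_bom` on the raw bytes of each converted file shows whether the "no BOM" option was respected.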


#9 Re: UTF-8 BOM only

Posted by: pspad | Date: 11/10/2017 08:35

Hello, I will fix it.

Editor PSPad - freeware editor, © 2001 - 2017 Jan Fiala