Re: BOMs for UTF-8 and UTF-16

#1 BOMs for UTF-8 and UTF-16

Posted by: tmpad | Date: 2014-01-18 23:11

I searched the forum for BOM-related issues but did not find a satisfactory answer.

Config:
Default CP for files opening: Menu format settings
Ident bytes in UTF-8 coding: Checked.

Check Menu | Format | UTF-16 LE
Create a new file and the status line says UTF-16 LE. In Hex Edit Mode the BOM is "FF FE". OK.

Check Menu | Format | UTF-8
Create a new file and the status line says UTF-8. In Hex Edit Mode the BOM is also "FF FE", but the BOM for UTF-8 is "EF BB BF".

Isn't this a bug?
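
For reference, the BOM signatures mentioned above can be checked outside the editor by reading the first bytes of the saved file. The following is only a minimal sketch in Python; the function name `detect_bom` is made up for illustration and has nothing to do with PSPad itself.

```python
import codecs
from typing import Optional

# BOM signatures taken from Python's codecs module.
# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM
# (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),        # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),  # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),  # FE FF
]

def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

print(detect_bom(b"\xef\xbb\xbfhello"))   # UTF-8
print(detect_bom(b"\xff\xfeh\x00i\x00"))  # UTF-16 LE
print(detect_bom(b"hello"))               # None
```

To test a real file, pass it the first four bytes read in binary mode, for example `detect_bom(open(path, "rb").read(4))`, where `path` is whatever file was saved from the editor.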



#2 Re: BOMs for UTF-8 and UTF-16

Posted by: pspad | Date: 2014-01-19 09:47

There are two hex modes (see the help). If you open a file as text and then switch to hex mode, you will always see the content in UTF-16LE, because it is an image of the editor's memory.
If you want to see the real file content, open the file directly in the hex editor (right-click in Windows Explorer, or use the menu File / Open in Hex editor).
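
The difference between the two views can be illustrated with a short Python sketch. It only models the situation described in this thread (a UTF-8 file on disk versus a UTF-16LE buffer that starts with FF FE); how PSPad actually stores text internally is an assumption here, and the helper `hex_dump` is hypothetical.

```python
import codecs

def hex_dump(data: bytes) -> str:
    """Format bytes the way a hex editor shows them, e.g. 'EF BB BF 48 69'."""
    return " ".join(f"{b:02X}" for b in data)

text = "Hi"

# What a UTF-8 file with BOM contains on disk.
on_disk_utf8 = codecs.BOM_UTF8 + text.encode("utf-8")

# What a UTF-16LE memory image with a leading BOM would look like
# (assumed model of the in-memory view, based on the thread).
in_memory_utf16 = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(hex_dump(on_disk_utf8))     # EF BB BF 48 69
print(hex_dump(in_memory_utf16))  # FF FE 48 00 69 00
```

The same two characters produce different byte sequences, which is why the hex view of an opened text file does not match the bytes stored in the file.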



#3 Re: BOMs for UTF-8 and UTF-16

Posted by: tmpad | Date: 2014-01-29 20:09

To be frank, it's not really clear from the help that there are two different modes.

1) You say that when I choose "View | Hex Edit MODE", I will see the content in UTF-16LE. But when I do so with a UTF-8 encoded file, the status line still reads UTF-8, not UTF-16LE.

2) The hex code starts with a UTF-16LE BOM. BOMs are meant for files, not for memory images.

3) When I overwrite the BOM with a blank (bytes 20 00) and switch back to the normal view, each character is now followed by a blank.

I consider these to be bugs.
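
One possible explanation for point 3, offered only as a guess and not as confirmed PSPad behaviour: once the FF FE marker is gone, the buffer may be read as single-byte text, so the 00 high byte of every UTF-16LE code unit shows up as a separate character. The Python sketch below reproduces that effect; the variable names are illustrative.

```python
import codecs

# A UTF-16LE buffer with a leading BOM, as seen in the hex view.
buffer = bytearray(codecs.BOM_UTF16_LE + "abc".encode("utf-16-le"))

# Overwrite the BOM (FF FE) with a UTF-16LE space (20 00).
buffer[0:2] = b"\x20\x00"

# Still decoded as UTF-16LE, the text is simply " abc".
as_utf16 = bytes(buffer).decode("utf-16-le")

# Decoded as single-byte text (Latin-1 is an assumption), every
# character is followed by a NUL, which many editors draw as a blank.
as_single_byte = bytes(buffer).decode("latin-1")

print(repr(as_utf16))        # ' abc'
print(repr(as_single_byte))  # ' \x00a\x00b\x00c\x00'
```

If this guess is right, the BOM in the memory image is what tells the editor how to read the buffer back when leaving hex mode.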



#4 Re: BOMs for UTF-8 and UTF-16

Posted by: pspad | Date: 2014-01-30 05:51

I don't understand what you want to do.
If you want to save a file in UTF-8 without a BOM, go to Program settings / Program 2 and switch off the option to include a BOM in UTF-8.



#5 Re: BOMs for UTF-8 and UTF-16

Posted by: tmpad | Date: 2014-01-30 17:42

I create a UTF-8 encoded text file with another program.
I load that file into PSPad; the status line says UTF-8.
I switch to "View | Hex Edit MODE".
The status line still says UTF-8, but the hex display starts with a UTF-16 LE BOM.

In Hex Edit MODE, the status line should read "UTF-16 LE", and the hex display should not show a BOM.
BOMs are used to tell other programs about the encoding of text files. In a hex view of data in memory, a BOM is superfluous: the program already knows the encoding.






