Posted by: Extranjero | Date: 2021-08-21 17:41 | IP: IP Logged
I have a file (maybe several files ...) containing strings like "¶¬" or "µ¬".
I'm using these 'strange character' combinations as separators/IDs in 'custom dictionaries' because in 'real life' these combinations rarely exist, and they are eliminated from keys and values before creating the dictionary ...
Opening the file with some other text editor works fine.
Opening the file with PSPad sometimes works, but mostly a warning window pops up saying the file 'contains broken UTF-8 encoding. Do you want to open it?'
If you answer the warning with NO, an error message pops up saying the file cannot be opened.
In this case PSPad loses the ability to open the file from the project file list on the left (by clicking it) until PSPad is restarted.
After PSPad is restarted, the attempt to open the file generally ends with the same warning window saying the file 'contains broken UTF-8 encoding' ...
In the file explorer window on the left, 'open with Notepad' works fine, while 'open with PsPad' ends with the warning window saying the file 'contains broken UTF-8 encoding' ...
If you answer the 'contains broken UTF-8 ..' warning with YES, the file encoding changes from ANSI Western European (1252) to Unicode UTF-8 no BOM (65001), but 'ANSI Western European (1252)' still remains selected in the Encoding menu. The characters of the 'rare' strings are converted to �. If I switch the file encoding back to ANSI Western European (1252) by clicking the encoding information at the bottom of the PSPad window and save the file with that encoding, then on reopening the file stays in ANSI Western European (1252) and the � signs have been converted to standard ? signs.
Clicking on file-info and statistics throws an
Access violation at address 00007FFB1100A525 in module 'KERNELBASE.dll'. Read of address 000000000000073C.
... which can be closed without PSPad crashing; the file-info window then opens showing: word count zero, line count 47 (see image)
Windows 10 Pro 64bit, PSPad(64) 5.0.7 (681), run from an external disk (create project from folder on start). The same happens after downgrading to 649 ... but back when 649 was the latest version it didn't happen.
BTW: having reopened PSPad so many times, I noticed that when selecting 'Create Project from Directory', the 'Project Default Directory' search window opens a little further down and to the right on each new search ...
And I remember the times when the Program Default Directory was scrolled into view there, and in Dark Theme the search window is bright ... but these are not important details.
Posted by: pspad | Date: 2021-08-22 18:08 | IP: IP Logged
Don't open it as UTF-8; open it as ANSI. In that case PSPad opens your file the same way Notepad does.
Mixing UTF-8 encoding with ANSI encoding isn't a good idea.
PSPad detects UTF-8 encoding automatically but can't open the file directly as UTF-8 because it contains characters that are not valid UTF-8. Version 5.0.7 will ask you, and you can decide whether you want to open it as UTF-8 or not.
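As an illustration of the point above, here is a minimal Python sketch of why such a file fails UTF-8 validation, assuming it was saved as Windows-1252 (ANSI Western European). The helper name `decode_with_fallback` is hypothetical, and this is not PSPad's actual detection logic; it only shows the "try UTF-8, otherwise fall back to ANSI" idea, plus the � and ? effect from the first post.

```python
from typing import Tuple


def decode_with_fallback(data: bytes) -> Tuple[str, str]:
    """Return (text, encoding_used): strict UTF-8 first, then cp1252."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Windows-1252.
        # errors="replace" covers the few bytes cp1252 leaves undefined.
        return data.decode("cp1252", errors="replace"), "cp1252"


# "¶¬" saved as Windows-1252 is the byte pair B6 AC.
raw = "key\u00b6\u00acvalue".encode("cp1252")

# 0xB6 and 0xAC are UTF-8 continuation bytes, so neither can start
# a character and strict UTF-8 decoding fails.
text, encoding = decode_with_fallback(raw)
print(encoding, text)

# Forcing UTF-8 turns each bad byte into U+FFFD; saving that text as
# cp1252 then degrades every U+FFFD to "?".
forced = raw.decode("utf-8", errors="replace")
print(forced.encode("cp1252", errors="replace"))
```

The fallback order matters: almost any byte sequence decodes "successfully" as cp1252, so the strict UTF-8 attempt has to come first.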
Posted by: Tom-Trottier | Date: 2022-08-30 21:28 | IP: IP Logged
I have this problem with one file. 5.0.7 (731) does not give me any choice. Previous versions of the file open as UTF-8, but the lines all run together into a single 2MB line.
Posted by: Tom-Trottier | Date: 2022-08-30 21:29 | IP: IP Logged
Could there be a size problem?
Posted by: pspad | Date: 2022-08-31 01:47 | IP: IP Logged
Hello. If there is a mix of UTF-8 and non-UTF-8 characters, PSPad should ask you whether you want to open the file as UTF-8.
Selecting UTF-8 and reopening should force PSPad to open your file with UTF-8 encoding.
PSPad (the editor component) has a problem with long lines: they slow it down.
Posted by: Tom-Trottier | Date: 2022-09-01 21:08 | IP: IP Logged
I was not given the choice. It's odd it just happened. Even the backup file has the same problem. Do you have any problems with a 2MB file?
Posted by: Tom-Trottier | Date: 2022-09-01 21:41 | IP: IP Logged
Notepad++ has no problems with it. It showed the "LS" (line separator) control character instead of CRLF (carriage return, line feed).
After changing LS to CRLF, I saved in Notepad++. Opening the file in PSPad still reported illegal UTF-8 characters, with the choice to open or quit.
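The LS-to-CRLF replacement can also be sketched in a few lines of Python. This assumes that "LS" is the Unicode line separator U+2028 and that the text has already been decoded correctly; `normalize_line_separators` is a hypothetical helper name, not something from PSPad or Notepad++.

```python
LINE_SEPARATOR = "\u2028"  # displayed as "LS" by Notepad++


def normalize_line_separators(text: str) -> str:
    """Replace every U+2028 with CRLF so editors see ordinary line breaks."""
    return text.replace(LINE_SEPARATOR, "\r\n")


sample = "first line\u2028second line\u2028third line"
fixed = normalize_line_separators(sample)
print(fixed.split("\r\n"))
```

When applying this to a real file, passing `newline=""` to `open()` for both reading and writing keeps Python from translating the line endings a second time.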
I suggest giving the option to replace illegal UTF-8 characters with some special character/sequence, then scrolling to the first instance so the user can see where the problem is and fix it in context. You might want to give a count of the illegal UTF-8 characters at the start in case there are hundreds.
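A rough Python sketch of that suggestion, outside of PSPad: it lists the byte offset of every invalid UTF-8 run so that the count and the first position can be reported. `find_invalid_utf8` is a hypothetical name and the sample bytes are made up for illustration.

```python
from typing import List, Tuple


def find_invalid_utf8(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (byte_offset, bad_bytes) for each invalid UTF-8 run in data."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # everything from pos onward is valid
        except UnicodeDecodeError as err:
            # err.start/err.end are relative to the slice, so shift by pos.
            start, end = pos + err.start, pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # resume right after the offending bytes
    return problems


# Made-up sample: cp1252 bytes for "¶¬" and "µ¬" inside ASCII text.
data = b"ok \xb6\xac then \xb5\xac end"
problems = find_invalid_utf8(data)

print("invalid sequences:", len(problems))
if problems:
    print("first one at byte offset", problems[0][0])

# Replacing each bad run with U+FFFD gives a marker the user can search for.
marked = data.decode("utf-8", errors="replace")
print(marked)
```

Re-decoding the remaining slice after each error keeps the sketch short; for a file with very many errors a single-pass approach would be faster.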
Posted by: Tom-Trottier | Date: 2022-09-01 21:47 | IP: IP Logged
pspad: Hello. If there is a mix of UTF-8 and non-UTF-8 characters, PSPad should ask you whether you want to open the file as UTF-8.
Selecting UTF-8 and reopening should force PSPad to open your file with UTF-8 encoding. ...
When I opened it as UTF-8, all the UTF-8 characters turned into one or more question marks.
Posted by: Tom-Trottier | Date: 2022-09-02 20:24 | IP: IP Logged
I fixed my problem by copying the text from Notepad++ into PSPad - now everything is good, including the lines.