Re: Bug with XML reformat

#1 Bug with XML reformat

Posted by: amj0306 | Date: 2015-05-04 12:14

The feature "HTML" -> "Reformat HTML Code"
has a bug: it removes leading and trailing spaces from element values in the
XML source. The reformat should not change the values inside the tags; it should
only reformat the XML structure.

To reproduce:
1. Save the following as "Test.xml":
<?xml version="1.0" encoding="UTF-8"?>
<TestStructure>
<leadingSpace> leading_space</leadingSpace>
<twoLeadingSpace>  leading_space</twoLeadingSpace>
<trailingSpace>trailing_space </trailingSpace>
</TestStructure>

NOTE:
1. the first tag <leadingSpace> has a value with a leading space
2. the second tag <twoLeadingSpace> has a value with 2 leading spaces
3. the third tag <trailingSpace> has a value with a trailing space

2. Open the file in PSPad and use
"View" -> "Change Syntax" -> "XML"
3. select "HTML" -> "Compress HTML Code"
* this works, the spaces are preserved:
<?xml version="1.0" encoding="UTF-8"?><TestStructure><leadingSpace> leading_space</leadingSpace><twoLeadingSpace>  leading_space</twoLeadingSpace><trailingSpace>trailing_space </trailingSpace></TestStructure>

4. Select "HTML" -> "Reformat HTML Code"
* This is the problem: the reformat changes the values in the tags by removing spaces:

<?xml version="1.0" encoding="UTF-8"?>
<TestStructure>
<leadingSpace>leading_space</leadingSpace>
<twoLeadingSpace>leading_space</twoLeadingSpace>
<trailingSpace>trailing_space</trailingSpace>
</TestStructure>
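
The difference between the two results can also be checked mechanically. The following is a minimal sketch, assuming Python with its standard `xml.etree.ElementTree` module; the helper name `text_values` is hypothetical and nothing here reflects how PSPad works internally. It parses the XML before and after the reformat and lists every leaf element whose value changed.

```python
import xml.etree.ElementTree as ET

def text_values(xml_source):
    """Return (tag, text) for every leaf element, keeping whitespace as-is."""
    root = ET.fromstring(xml_source)
    return [(el.tag, el.text or "") for el in root.iter() if len(el) == 0]

# The test document from this report, before reformatting.
original = (
    "<TestStructure>"
    "<leadingSpace> leading_space</leadingSpace>"
    "<twoLeadingSpace>  leading_space</twoLeadingSpace>"
    "<trailingSpace>trailing_space </trailingSpace>"
    "</TestStructure>"
)

# The same document as returned by "Reformat HTML Code".
reformatted = (
    "<TestStructure>\n"
    "<leadingSpace>leading_space</leadingSpace>\n"
    "<twoLeadingSpace>leading_space</twoLeadingSpace>\n"
    "<trailingSpace>trailing_space</trailingSpace>\n"
    "</TestStructure>"
)

# Pair leaf elements in document order (assumes both documents have the
# same structure) and keep the ones whose text differs.
changed = [
    (tag, before, after)
    for (tag, before), (_, after) in zip(text_values(original), text_values(reformatted))
    if before != after
]

for tag, before, after in changed:
    print(f"{tag}: {before!r} -> {after!r}")
```

For this test file all three elements are reported as changed, which matches the behaviour described above.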



#2 Re: Bug with XML reformat

Posted by: Andreas | Date: 2015-05-04 17:54

I think it should only preserve whitespace when the node is set to xml:space="preserve"
www.w3.org/TR/2008/REC-xml-20081126/#sec-white-space
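
As an illustration of how xml:space applies, here is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the function name `preserved_tags` is hypothetical. Per the linked section of the specification, the attribute applies to the element that carries it and to everything inside it, unless a nested element overrides it. The sketch walks the tree and collects the elements for which "preserve" is in effect.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predeclared xml: prefix via its namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def preserved_tags(xml_source):
    """Return the tags of elements for which xml:space="preserve" is in effect."""
    result = []

    def walk(element, inherited):
        # An explicit attribute wins; otherwise the parent's mode is inherited.
        mode = element.get(XML_SPACE, inherited)
        if mode == "preserve":
            result.append(element.tag)
        for child in element:
            walk(child, mode)

    walk(ET.fromstring(xml_source), "default")
    return result

sample = (
    "<test>"
    '<test1 xml:space="preserve">   text   <inner> x </inner></test1>'
    "<test2>   text   </test2>"
    "</test>"
)
print(preserved_tags(sample))  # → ['test1', 'inner']
```

A reformatter following this reading would leave the content of test1 and inner untouched and would be free to normalise test2.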



#3 Re: Bug with XML reformat

Posted by: pspad | Date: 2015-05-04 19:28

How does TiDy handle it with the XML reformat option? It is available at the bottom of the HTML menu.



#4 Re: Bug with XML reformat

Posted by: Andreas | Date: 2015-05-05 00:41

Tidy respects xml:space="preserve" but inserts newlines inside nodes with this attribute.

before:

<?xml version="1.0" encoding="UTF-8"?>
<test>
<test1 xml:space="preserve"> three spaces leading between trailing </test1>
<test2> three spaces leading between trailing </test2>
</test>

after TiDy XML reformat:

<?xml version="1.0" encoding="utf-8"?>
<test>
<test1 xml:space="preserve">
three spaces leading between trailing
</test1>
<test2>three spaces leading between trailing</test2>
</test>



#5 Re: Bug with XML reformat

Posted by: amj0306 | Date: 2015-05-05 12:17

In my case the whitespace is part of the value of the data element,
and not just insignificant whitespace between markup tags.

(I ran into this issue when a transaction failed because the receiving application did not expect leading spaces.
I was using PSPad to reformat the log files to beautify them for readability and did not see the leading spaces. When the receiving application insisted that the leading spaces were there, I went back to the logs and noticed that leading spaces were present *before* PSPad beautified the text but not after.)

NOTE:

I have tried other XML reformat tools and they preserve the whitespace in the data elements, which is why I raised this issue.

I loaded the following:
<?xml version="1.0" encoding="UTF-8"?><TestStructure><leadingSpace> leading_space</leadingSpace><twoLeadingSpace>  leading_space</twoLeadingSpace><trailingSpace>trailing_space </trailingSpace></TestStructure>

Into the following tools:

I tried First Object XML editor application (a freeware xml editor)
www.firstobject.com .. preserves whitespace

I also tried several online tools:
www.webtoolkitonline.com .. preserves whitespace
xmltoolbox.appspot.com .. preserves whitespace
codebeautify.org .. preserves whitespace
xmlbeautifier.com .. preserves whitespace
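
The behaviour these tools show can be sketched roughly as follows. This is only an illustration, assuming Python's standard `xml.etree.ElementTree`; the function name `indent_structure` is hypothetical and this is not the code of any tool listed above. The idea is to add indentation only between child elements and to leave the text of leaf elements untouched. It assumes there is no DTD and ignores xml:space.

```python
import xml.etree.ElementTree as ET

def indent_structure(element, level=0, step="  "):
    """Indent child elements in place without touching any non-blank text."""
    children = list(element)
    if not children:
        return  # leaf element: its text is data, leave it alone
    has_text = bool((element.text or "").strip())
    has_tail_text = any((child.tail or "").strip() for child in children)
    if has_text or has_tail_text:
        return  # mixed content: leave the whole subtree as it is
    pad = "\n" + step * (level + 1)
    element.text = pad  # whitespace before the first child
    for child in children:
        indent_structure(child, level + 1, step)
        child.tail = pad  # whitespace before the next sibling
    children[-1].tail = "\n" + step * level  # closing tag goes back one level

source = (
    "<TestStructure>"
    "<leadingSpace> leading_space</leadingSpace>"
    "<twoLeadingSpace>  leading_space</twoLeadingSpace>"
    "<trailingSpace>trailing_space </trailingSpace>"
    "</TestStructure>"
)
root = ET.fromstring(source)
indent_structure(root)
formatted = ET.tostring(root, encoding="unicode")
print(formatted)
```

Each element ends up on its own indented line, while the leading and trailing spaces inside the three values stay exactly as they were in the input.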



#6 Re: Bug with XML reformat

Posted by: Andreas | Date: 2015-05-05 19:05

I'm not good at reading specifications, but I think following the specification is the right way to go.
www.w3.org/TR/xml/#sec-white-space
www.w3.org/TR/xml11/#sec-white-space

I think it's OK to keep whitespace after formatting. However, I think amj0306 should review his project, as it is not good practice to deal with text nodes containing leading, trailing or successive blanks.



#7 Re: Bug with XML reformat

Posted by: carbonize | Date: 2015-05-07 14:54

You are telling it to treat the XML as HTML, and HTML does not recognise white space except between characters. This is why we have &nbsp;.

--
Carbonize
