-
Notifications
You must be signed in to change notification settings - Fork 737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xmldocument::save uses non-transformed encoding parameter #617
Comments
What user visible consequences does this have without the change to add utf8? Why should the default behavior be to emit encoding=utf-8 for every document? That’s already what the XML specification says should be the default. |
As I said in the pull request (forgot to mention it explicitly here): it's a minor bugfix - in the current implementation, I don't see a problem - however, if That's why it's a bug without consequences in the current implementation, but it's still a bug as far as I can tell.
Well I suppose should is not the right word - it's just a matter of preference in my case - when working with XLSX documents created in e.g. LibreOffice, the encoding=utf-8 gets stripped when saving a modified file with pugixml. I just thought that this initial xml tag has basically zero impact on performance, so explicit encoding is better than implicit. It's your project - leave that change out if you don't like it :) |
Ok - I don't know that it matters either way, but feel free to submit a PR that changes the encoding check to buffered_writer.encoding. I do not think the default output should be changed. Stylistic preferences can be accommodated using a documented method in this case: https://pugixml.org/docs/manual.html#saving.declaration - and the default should be the minimal output that results in unambiguous decoding result, which is why there's a special case for latin1 but not for any other encoding - every other encoding can/should use BOM or default to UTF8 without any extras. I'm going to close this either way, feel free to change the PR to only change the encoding if you need that change. |
Actually, I am not entirely sure how to patch this - Before I send you a pull request, I was hoping you could help me understand why my forked version breaks these validations:
The "Test document_save_file_wide" failed makes be wonder whether line 3621 is the one that triggers a differing behavior under test with my path. Is there maybe an actual bug here? |
In
pugixml.cpp
/xml_document::save
, theimpl::xml_buffered_writer
constructor invoked forbuffered_writer
usesget_write_encoding
on parameterencoding
, which can transformxml_encoding::encoding_auto
into a specific encoding (e.g.encoding_utf8
), stored in buffered_writer.encoding.Accordingly,
buffered_writer.encoding
should be evaluated, while currently only the unmodified function parameterencoding
is checked.Pull request #616 xmldocument::save - use encoding as interpreted by get_write_encoding… fixes this.
The same pull request also adds explicit UTF-8 encoding to the target file, but because this can change how an empty document looks, this requires a change to the
tests/test_document.cpp
, which I don't know how to make because the tests are hard to read due to macros.The text was updated successfully, but these errors were encountered: