Fixes parsing bug for floats with non english culture #794
Adds `YamlFormatter.NumberFormat` to the `Parse` calls in `ScalarNodeDeserializer.AttemptUnknownTypeDeserialization`. Without it, tests failed under non-English cultures (tested with "es-AR").
The reason it doesn't use `TryParse` is that it is not available in older frameworks like 3.5, which we currently support. I'm not entirely sure when the `TryParse` methods were added, but there may be other framework versions we compile against that don't have it either. I need to do some personal checks with this PR before I can merge it in. My biggest concern is people who use this library on a culture other than English and expect that to keep working: if we change the culture, it suddenly no longer works with their YAML. This could be a pretty impactful change, so I'm being very cautious with merging it in.
Oh, I understand. That's totally fine. Also, there are a lot of allocations in other parts of the parser, so a few extra allocations don't make any difference. In the future, maybe it's possible to wrap the
That's a very good observation. I doubt anyone is relying on that behaviour, because it only affects cultures that use a comma instead of a dot as the decimal separator, but it's worth investigating before merging.
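For readers following along, the separator problem discussed above can be sketched roughly as follows. This is a minimal illustration, not the library's code: `CultureParsingDemo`, `CommaDecimalFormat`, `ParseInvariant` and `ParseWith` are hypothetical names, and the comma-decimal format is built by hand as a stand-in for a culture such as "es-AR" so the sketch does not depend on the culture data installed on the machine.

```csharp
using System;
using System.Globalization;

public static class CultureParsingDemo
{
    // Hypothetical stand-in for a comma-decimal culture such as "es-AR".
    // Built from the invariant culture so the demo does not depend on
    // which culture data is available on the machine.
    public static NumberFormatInfo CommaDecimalFormat()
    {
        var format = (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        format.NumberGroupSeparator = ".";
        format.NumberDecimalSeparator = ",";
        return format;
    }

    // Culture-independent parsing: '.' is always the decimal separator.
    public static double ParseInvariant(string text) =>
        double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

    // Parsing with an explicit provider and the default number styles,
    // which also accept group separators.
    public static double ParseWith(string text, IFormatProvider provider) =>
        double.Parse(text, provider);

    public static void Main()
    {
        const string yamlScalar = "1.5";

        // Same text, same result on every machine.
        Console.WriteLine(ParseInvariant(yamlScalar) == 1.5);

        // Under a comma-decimal format the '.' is read as a group
        // separator, so the parsed value is no longer 1.5.
        Console.WriteLine(ParseWith(yamlScalar, CommaDecimalFormat()) == 1.5);
    }
}
```

Calling `double.Parse(text)` with no provider behaves like `ParseWith` with the current thread culture, which is why the same YAML can parse differently on two machines.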
A bit off topic, but having such a wide set like: <TargetFrameworks>net70;netstandard2.0;netstandard2.1;net35;net40;net45;net47;net60</TargetFrameworks> might just cause unnecessary maintenance burden. People left maintaining .NET 3.5 era software aren't supposed to do library upgrades anyway while trying to keep the program running (just too unsafe). I believe many OSS projects have moved to supporting the oldest full framework that Microsoft supports, which is
Just my two cents, and thank you to all who are maintaining and improving this library!
According to the compatibility table at the bottom of the dotnet documentation, the
I modified the targets to support only supported .NET versions last year. 3.5 was still on that list and it made me sad. I do agree on wishing it wasn't and on removing it. I've thought about cutting ties to that and 4.0 and potentially 4.5. Those last two are there for Unity support, if I recall.
I don't understand much about Unity as I don't develop for it (but Jint does support it, and a lot of people use Jint with Unity). It seems that Unity's .NET profile support basically states that
So I just looked at this. I'm a bit nervous about forcing a specific culture/format in those parse methods instead of using the local machine culture like it currently does. I don't think that is the best idea. The fix I would prefer is to set the locale in the test project, which is actually easy to do. Add a
<?xml version="1.0" encoding="utf-8"?>
<!-- File name extension must be .runsettings -->
<RunSettings>
<RunConfiguration>
<EnvironmentVariables>
<!-- List of environment variables we want to set-->
<LANG>en_US.UTF-8</LANG>
<LANGUAGE>en_US:en</LANGUAGE>
<LC_ALL>en_US.UTF-8</LC_ALL>
</EnvironmentVariables>
</RunConfiguration>
</RunSettings>
csproj property:
<RunSettingsFilePath>$(MSBuildProjectDirectory)\.runsettings</RunSettingsFilePath>
I tested this fix in a Docker image set to English/Denmark (which uses a
Dockerfile:
FROM mcr.microsoft.com/dotnet/sdk:7.0
RUN apt update
RUN apt install -y vim
RUN apt install -y locales
RUN sed -i '/en_DK.UTF-8/s/^# //g' /etc/locale.gen && \
locale-gen
RUN useradd edward -d /source -m -u 1000 -U
ENV LANG en_DK.UTF-8
ENV LANGUAGE en_DK:en
ENV LC_ALL en_DK.UTF-8
USER edward
WORKDIR /source
I understand that you are afraid of breaking existing apps that upgrade. But on the other hand, let me explain my use case: I think the best option would be to follow the YAML spec with respect to the parsing of floating-point numbers and bump the major number in the semver to signify a major breaking change (although it shouldn't be a breaking change if you are following the YAML spec). It's just my point of view. Btw: very smart idea to create a custom Docker image for that. That didn't cross my mind.
IMHO a YAML document should always use the invariant culture.
I've been thinking about this over the past couple of days. A solution that would be pretty good is to be able to specify the culture when creating the serializer and deserializer, defaulting to the YAML spec format. That way people can still use the current machine culture, or any other one, if they need it. How does that sound?
@EdwardCooke That looks fine to me. Although I can't imagine a single use case of someone who wants to encode floats in a different encoding than the YAML spec. I wouldn't overcomplicate things and I would only support the YAML spec (culture-independent parsing).
Were you going to make the requested change?
Sorry, I was waiting for your answer before doing anything. How would you like to approach this?
Follow my last comment, that would be great.
Any progress on this? I can try and work on it over the next week or two.
Sorry. Currently I'm pretty busy with work and with health issues.
No worries, I'll get it fixed. I'm going to close this PR, thanks for your attempt.
Sorry. I really love this project and definitely will make more PRs in the future (maybe I can help with the Wiki or docs). But right now is not the best moment.
Not a problem. Health always takes priority!
It seems
Nope. Sure isn't.
@EdwardCooke It's because of the following reasons.
Summary
This PR fixes a serious bug that caused incorrect parsing of floats and doubles on machines with non-English cultures.
The bug was reproduced on Windows 11 with the culture set to `Spanish (Argentina)`. If the culture is `English (United States)`, the parsing works fine.
This PR fixes a bug related to #792. For a full explanation refer to that issue.
Fix
The fix adds the `YamlFormatter.NumberFormat` format provider to the `Parse` methods in `ScalarNodeDeserializer.AttemptUnknownTypeDeserialization`.
`English (United states)` but failed in `Spanish (Argentina)`.
`English (United states)` culture and `Spanish (Argentina)` culture.
Extra comments
Question: Why does this part of the code not use `TryParse`? Using `Parse` and the `TryAndSwallow` method causes unnecessary allocations for the exceptions. I think it would be an easy change.
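As a rough illustration of the question above, an exception-free variant could look like the sketch below. It assumes a target framework where this `double.TryParse` overload is available, and `InvariantTryParseDemo` and `TryParseYamlFloat` are hypothetical names, not part of the library. It only covers plain decimal and exponent notation, not YAML-specific forms such as `.inf` or `.nan`.

```csharp
using System;
using System.Globalization;

public static class InvariantTryParseDemo
{
    // Hypothetical helper: returns false instead of throwing when the
    // text is not a float, and always treats '.' as the decimal
    // separator regardless of the machine culture.
    public static bool TryParseYamlFloat(string text, out double value) =>
        double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);

    public static void Main()
    {
        foreach (var candidate in new[] { "1.5", "1,5", "not a number" })
        {
            if (TryParseYamlFloat(candidate, out var parsed))
            {
                Console.WriteLine(candidate + " parsed as " + parsed.ToString(CultureInfo.InvariantCulture));
            }
            else
            {
                // No exception is allocated on this path.
                Console.WriteLine(candidate + " is not a float");
            }
        }
    }
}
```

`NumberStyles.Float` leaves out thousands separators, so a comma in the input makes the parse fail rather than being silently skipped.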