Bots numeric precision limits #393

Open
mrkazoodle opened this issue Feb 27, 2021 · 2 comments

@mrkazoodle

bots/bots/inmessage.py

Lines 185 to 186 in 3277cc1

valuedecimal = float(value)
value = '%.*F'%(lendecimal,valuedecimal)

Bots has numeric precision limits because of code such as the lines above. To validate fixed numeric input, the value is cast from a string to a float. If the cast fails, the error is logged and processing stops; otherwise the value is formatted back to a string with the original number of decimal places.

Impact:

  • Large integers, like SSCC identifiers (18 digits), lose precision
  • Very precise decimals lose precision
  • The problem is hard to find: the precision loss raises no error
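
A minimal sketch of the effect, not the actual Bots code path: the helper name `roundtrip_via_float` is hypothetical, and `lendecimal` is assumed to be the number of digits after the decimal point (how Bots computes it is not shown in the quoted lines). The 18-digit value is made up, not a real SSCC.

```python
def roundtrip_via_float(value, lendecimal):
    """Mimic the two quoted lines: string -> float -> fixed-point string."""
    valuedecimal = float(value)
    return '%.*F' % (lendecimal, valuedecimal)


sscc = "123456789012345679"  # made-up 18-digit value, not a real SSCC
result = roundtrip_via_float(sscc, 0)

# No exception is raised, yet the digits that come back differ from the input.
print(sscc, "->", result, "changed:", result != sscc)

# A short value that a float can represent exactly survives the round trip.
print(roundtrip_via_float("12.50", 2))
```

A double has a 53-bit significand, so integers above 2**53 (about 9.0e15) cannot all be represented exactly. An 18-digit value is well past that limit.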
@eppye-bots
Owner

To have an SSCC as numeric... well, even the GS1 people advise storing it as an alphanumeric thing.
Yes, it is true that an SSCC has only numeric characters, but that is considered a different thing.
So: Bots interprets 'numeric' as integer or real, which is IMHO a fairly normal interpretation.
Yes, one can think an SSCC is numeric. Do not; it is a mistake. It just has numeric characters.
A 'real' has numeric characters plus a minus sign and a decimal point, and those are not numeric characters. Get the point? Different concepts.
So: do not store an SSCC as numeric in a database. Nobody does.
I do know it is confusing.

@mrkazoodle
Author

Good morning Henk-Jan,

I would definitely try to store SSCC codes as 64-bit integers in a database: the maximum signed value of a Java long, for example, is 9,223,372,036,854,775,807. An 18-digit integer fits comfortably, and we have had 64-bit hardware for a long time. A string version would need 18 bytes/chars, which is more than double the 8 bytes required for a long.
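
The arithmetic can be checked with a short sketch. The sample value is made up, and `struct` stands in for a database's 64-bit integer column purely for illustration.

```python
import struct

INT64_MAX = 2**63 - 1          # 9,223,372,036,854,775,807
LARGEST_18_DIGIT = 10**18 - 1  # 999,999,999,999,999,999

sscc_text = "123456789012345679"  # made-up 18-digit value, not a real SSCC

# Pack as a signed big-endian 64-bit integer: always 8 bytes.
packed = struct.pack(">q", int(sscc_text))
restored = struct.unpack(">q", packed)[0]

print(LARGEST_18_DIGIT <= INT64_MAX)   # every 18-digit value fits
print(len(packed), len(sscc_text.encode("ascii")))
print(restored == int(sscc_text))      # integers round-trip exactly
```

One caveat with the integer form: a code that starts with 0 loses its leading zero, so it has to be padded back to 18 digits (for example with `str(n).zfill(18)`) when it is written out.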

Also, string comparison normally tries to optimise by comparing the lengths of the strings first, but the length is always 18, so no luck there. An SSCC code starts with a prefix and a company prefix of at least 7 digits, so over 99% of the SSCC strings you'd expect to find will share the same first 8 to 12 digits. This means that string comparison is again not optimal: even if you can compare 8 digits/characters in the same CPU cycle, you would expect never to encounter a difference in the first cycle.

But that is not the point.

The following (very similar) code is said to be efficient:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False
From https://stackoverflow.com/questions/354038/how-do-i-check-if-a-string-is-a-number-float

So maybe cast to Decimal as an alternative? Or just check that it is a number, without formatting the float back to a string (that is, without overwriting the original 'value')?
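
A rough sketch of both ideas combined, assuming Python's standard `decimal` module. The function name `validate_fixed_numeric` is hypothetical and this is not a patch against `inmessage.py`: the string is parsed only to check it, and the original text is returned unchanged.

```python
from decimal import Decimal, InvalidOperation


def validate_fixed_numeric(value):
    """Return value unchanged if it parses as a finite decimal number."""
    try:
        number = Decimal(value)
    except InvalidOperation:
        raise ValueError("not a number: %r" % (value,))
    if not number.is_finite():
        # Decimal also accepts 'NaN' and 'Infinity'; reject those here.
        raise ValueError("not a finite number: %r" % (value,))
    return value  # original digits are kept, nothing is reformatted


print(validate_fixed_numeric("123456789012345679"))
# Decimal parses the string exactly, so converting back loses nothing.
print(str(Decimal("0.1234567890123456789")))
```

Since `Decimal` builds its value exactly from the string, long integers and long fractions keep all their digits. The sketch may still accept input that Bots would not (for example exponent notation such as `1E3`), so a real change would need to match Bots' own field rules.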
