-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize normalizeLongitude method #373
Comments
You don't need different cases, you can usually use something like
|
If this is still an open issue, I suggest creating a project for it so that we can track this across all language versions. |
Recommended tag: implementation // specification |
I am considering to add to the OLC specification that longitude normalization MUST run in O(1) time. I consider this a security issue. |
Related #370 |
There are still several language versions that normalize via while-loop - for example c, cpp, python - so I'd suggest to keep this open and properly list all language versions that still need an update, as already requested above. |
IMO, normalizing with loops is optimal, especially for longitudes close to the normal form. In any case making this kind of changes without bench-marking (against day to day cases) is not a good idea, I think. |
Benchmarking is not necessary. Using an input of 100e100 creates a DOS security vulnerability. This approach fixes it. Using an extra MOD function increases computation cost by one arithmetic operation. We cannot reasonably expect any platform we are targeting to be harmed by an additional single instruction. |
@fulldecent I see, I wasn't thinking about malicious use but about performance in "standard" cases. Maybe we should keep the best performance in reasonable cases and prevent malicious use by a time-limited algorithm only if the value is too large? |
Quick back-of-the-envelope calculation: The old algorithm has one comparison and one addition/subtraction per loop, plus another comparison for the loop it doesn't need to take. That's ~ 2x+1 operations, with x being the number of "full circles" the non-normalized longitude is off. The new algorithm has two comparisons, plus if necessary two additions/subtractions, two modulo operations. That's a fixed amount of 6 operations in the worst case. Looping is strictly worse if x>2, but adding x=1,2 as special cases to the new algorithm would mean that we'd still loop once or twice (2*2+1=5) after performing four comparisons - two to drop out early, two to loop instead of calculating directly. That means we'd optimize a 6 op algorithm by making it a 7 or 9 op algorithm. |
I benchmarked two version of // the 'loop' version
func normalize(value, max float64) float64 {
for value < -max {
value += 2 * max
}
for value >= max {
value -= 2 * max
}
return value
}
// the 'mod' version
func normalize2(value, max float64) float64 {
if value >= -max && value < max {
return value
}
return math.Mod(value+max, 2*max) - max
} The // threshold for the normalization method
const toobig float64 = 2e4
func normalize(value, max float64) float64 {
// value's period
period := 2 * max
// if the value is too big use modulo
if value < -toobig || value > toobig {
return math.Mod(value+max, period) - max
}
// else use only additions to normalize
for value < -max {
value += period
}
for value >= max {
value -= period
}
return value
} |
Holding your phone closer to your face will make the calculation appear faster because the photons have less distance to travel to your eyes. That is roughly the magnitude of speed difference between these two algorithms in normal operating cases. In extreme cases (the attack scenario) one algorithm will execute "immediately" and the other will take longer on modern hardware than the lifespan of the Sun. |
I'm not seeing the same when normalizing an array with 1M random entries using the original "non-optimized, looping" algorithm vs. the "non-looping" algorithm in Kotlin/JVM. If entry values are at least somewhat reasonable (tested with +/- 500, 1000, 10000), looping over the array with either algorithm finishes in about the same time with negligible difference. If entry values are decidedly not reasonable (tested with +/- 5000000), the looping algorithm takes 50x as much time, which is exactly the DOS-style behaviour that prompted this whole issue. At this point, I wonder what exactly we're optimizing for? If it is individual plus code encodings, even a 400% increase might not be noticed by anyone if we're talking about microseconds, so avoiding DOS and code readability should be priority. If it is encoding millions of entries we're concerned with, it might be better to sanitize those entries before encoding them. For what it's worth, consider that applications such as Google Maps simply refuse to deal with non-normalized coordinates as input, so something like that is not really a concern. |
This is not an optimization. This is a fix for a specific problem. Anybody can optimize the time it takes to get in their house by removing the door. Converting 1M Plus Codes is not a normal use case. That is a batch operation run in a server non-interactively. If this was run interactively, then again the server would have more resources, parallel task running and again this would be on the order of time of a photon moving between a phone and an eyeball. |
Reopening to track fixes for other languages. |
Currently, our normalizeLongitude() methods use a while loop to get the longitude into an acceptable range. However, this can take a long time if the integrating application passes in a really large value for the longitude. We can resolve this while maintaining the same behavior by computing the normalized longitude in a single step. Something like:
This is the master issue for tracking the work to be done across the various implementations. See #370 for more context.
The text was updated successfully, but these errors were encountered: