-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up emit_dist with precomputed length codings for static trees #264
Speed up emit_dist with precomputed length codings for static trees #264
Conversation
it's late here so a proper review will have to wait till tomorrow, but this is looking really good!
That makes sense; they do more other work, so the emit functions are less of a bottleneck.
From a cursory look, you only use those operations on data from the tables right? That's fine: changing the endianness would be problematic when reading from the input stream, but if you just store the data in the opposite order then that's all good. We also test on a big-endian target (s390x) on CI, so it would catch any regressions. |
Right, I only changed |
there are helpers for using the native endianness
So as long as you make sure uses of these functions are "in-sync" you should be good. |
8b81773
to
0e37a75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed some changes to compute the table at compile time, so it is clear how it gets generated. This has no impact on the runtime.
Some small other things
12d5498
to
8347812
Compare
i added Btw, you can I think this looks good now, we'll likely merge this tomorrow, but first I need to make a release (our releases get audited, and someone did that before the weekend, so it's simpler to get this PR added to the next release). |
@@ -841,15 +841,18 @@ impl Value { | |||
self.a | |||
} | |||
|
|||
pub(crate) fn code(self) -> u16 { | |||
#[inline(always)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usage of #[inline(always)]
is discouraged unless you can actually prove that it improves performance, but it isn't all that important either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a bit of time to be convinced between the equivalence between encode_dist
+ emit_dist
and the old emit_dist
due to the shuffling of code, but I think it is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's tricky, I stared at it for a while too and it looks right.
thanks @brian-pane!
Benchmark results:
Notes: