Restore copyMatchTry4 #207

Merged: 1 commit into pierrec:v4 on Jun 14, 2023

@lizthegrey (Contributor) commented Jun 3, 2023

Fixes #205

Cortex-A72:

benchmark                        old ns/op     new ns/op     delta
BenchmarkUncompress-16           20.8          20.8          +0.00%
BenchmarkUncompressPg1661-16     1702946       1585558       -6.89%
BenchmarkUncompressDigits-16     124267        111448        -10.32%
BenchmarkUncompressTwain-16      1177542       1002623       -14.85%
BenchmarkUncompressRand-16       9120          9116          -0.04%

benchmark                        old MB/s     new MB/s     speedup
BenchmarkUncompressPg1661-16     221.60       238.00       1.07x
BenchmarkUncompressDigits-16     766.77       854.96       1.12x
BenchmarkUncompressTwain-16      217.74       255.73       1.17x
BenchmarkUncompressRand-16       1798.55      1799.46      1.00x

benchmark                        old allocs     new allocs     delta
BenchmarkUncompress-16           0              0              +0.00%
BenchmarkUncompressPg1661-16     4              4              +0.00%
BenchmarkUncompressDigits-16     4              4              +0.00%
BenchmarkUncompressTwain-16      4              4              +0.00%
BenchmarkUncompressRand-16       4              4              +0.00%

benchmark                        old bytes     new bytes     delta
BenchmarkUncompress-16           0             0             +0.00%
BenchmarkUncompressPg1661-16     184           184           +0.00%
BenchmarkUncompressDigits-16     184           210           +14.13%
BenchmarkUncompressTwain-16      186           186           +0.00%
BenchmarkUncompressRand-16       187           186           -0.53%

Ampere:

benchmark                       old ns/op     new ns/op     delta
BenchmarkUncompress-4           8.84          8.86          +0.25%
BenchmarkUncompressPg1661-4     946111        910414        -3.77%
BenchmarkUncompressDigits-4     62239         60205         -3.27%
BenchmarkUncompressTwain-4      599464        576336        -3.86%
BenchmarkUncompressRand-4       4250          4485          +5.53%

benchmark                       old MB/s     new MB/s     speedup
BenchmarkUncompressPg1661-4     398.86       414.50       1.04x
BenchmarkUncompressDigits-4     1530.93      1582.66      1.03x
BenchmarkUncompressTwain-4      427.72       444.88       1.04x
BenchmarkUncompressRand-4       3859.74      3657.60      0.95x

benchmark                       old allocs     new allocs     delta
BenchmarkUncompress-4           0              0              +0.00%
BenchmarkUncompressPg1661-4     4              4              +0.00%
BenchmarkUncompressDigits-4     4              4              +0.00%
BenchmarkUncompressTwain-4      4              4              +0.00%
BenchmarkUncompressRand-4       4              4              +0.00%

benchmark                       old bytes     new bytes     delta
BenchmarkUncompress-4           0             0             +0.00%
BenchmarkUncompressPg1661-4     184           184           +0.00%
BenchmarkUncompressDigits-4     184           184           +0.00%
BenchmarkUncompressTwain-4      184           184           +0.00%
BenchmarkUncompressRand-4       184           184           +0.00%

@lizthegrey (Contributor, Author) commented:

Tests pass if bc1239b is reverted. This failure predates my change.

@pierrec pierrec merged commit ef495ee into pierrec:v4 Jun 14, 2023
@lizthegrey lizthegrey deleted the lizf.copy4 branch June 14, 2023 14:37

Successfully merging this pull request may close these issues.

[arm64] Performance regression from removal of 4x loop decoding