-
Notifications
You must be signed in to change notification settings - Fork 891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GODRIVER-2914 x/mongo/driver: enable parallel zlib compression and improve zstd decompression #1320
GODRIVER-2914 x/mongo/driver: enable parallel zlib compression and improve zstd decompression #1320
Conversation
60ed52e
to
6463465
Compare
6463465
to
b84a5ef
Compare
@matthewdale the big TLDR here is that the current zlib compressor serializes all outbound messages to MongoDB because there is only one shared zlib encoded, which is rightfully protected by a mutex. This represents a serious performance bottleneck for users of this library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great performance improvements! Looks good pending a few requested maintainability changes.
x/mongo/driver/compression.go
Outdated
// The level is invalid so call zstd.NewWriter for the error. | ||
return zstd.NewWriter(nil, zstd.WithEncoderLevel(level)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the valid levels are ever updated, this could mask a bug that always creates a new writer. It's better to return an explicit error instead.
// The level is invalid so call zstd.NewWriter for the error. | |
return zstd.NewWriter(nil, zstd.WithEncoderLevel(level)) | |
// The level is outside the expected range, return an error. | |
return nil, fmt.Errorf("invalid zstd compression level: %d", level) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, we test against this and zstd.WithEncoderLevel
prevents this from happening but I think it's a good change incase there is a ever a regression in the zstd
lib.
x/mongo/driver/compression.go
Outdated
@@ -26,48 +26,70 @@ type CompressionOpts struct { | |||
UncompressedSize int32 | |||
} | |||
|
|||
var zstdEncoders sync.Map // map[zstd.EncoderLevel]*zstd.Encoder | |||
func zstdNewWriter(lvl zstd.EncoderLevel) *zstd.Encoder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned someone may use this function at runtime and not realize it panics instead of returning an error. We should add a comment describing the expected use.
func zstdNewWriter(lvl zstd.EncoderLevel) *zstd.Encoder { | |
// zstdNewWriter creates a zstd.Encoder with the given level and a nil | |
// destination writer. It panics on any errors and should only be used at | |
// package initialization time. | |
func zstdNewWriter(lvl zstd.EncoderLevel) *zstd.Encoder { |
An alternative would be to put all this initialization logic in an init
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think renaming zstdNewWriter
to mustZstdNewWriter
should make it clearer that this will panic and follows the general naming convention of Go functions that will panic.
The issue with putting this in an init
function is that it the order in which init
function calls is a bit tricky (though now formalized with go1.21) this can make it dangerous for any other init
to use any of the code here that relies on the zstdEncoders
.
I'm also of the opinion that without a really really strong use-case packages should never have an init
function (at Lyft I banned the use of init
functions in libraries unless there was an excellent case for it because of the trouble they caused - ran Go development there).
If we really wanted to doe this without exposing this function we could use this beast, but IMHO it's not
var zstdEncoders = func() [zstd.SpeedBestCompression + 1]*zstd.Encoder {
f := func(lvl zstd.EncoderLevel) *zstd.Encoder {
enc, err := zstd.NewWriter(nil, zstd.WithEncoderLevel(lvl))
if err != nil {
panic(err) // this should never happen
}
return enc
}
return [zstd.SpeedBestCompression + 1]*zstd.Encoder{
0: nil, // zstd.speedNotSet
zstd.SpeedFastest: f(zstd.SpeedFastest),
zstd.SpeedDefault: f(zstd.SpeedDefault),
zstd.SpeedBetterCompression: f(zstd.SpeedBetterCompression),
zstd.SpeedBestCompression: f(zstd.SpeedBestCompression),
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming it to mustZstdNewWriter
sounds like a great idea!
x/mongo/driver/compression.go
Outdated
// The level is invalid so call zlib.NewWriterLever for the error. | ||
_, err := zlib.NewWriterLevel(nil, level) | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the valid levels are ever updated, this could result in unintentionally returning nil, nil
. It's better to return an explicit error instead.
// The level is invalid so call zlib.NewWriterLever for the error. | |
_, err := zlib.NewWriterLevel(nil, level) | |
return nil, err | |
// The level is outside the expected range, return an error. | |
return nil, fmt.Errorf("invalid zlib compression level: %d", level) |
…ompression This commit fixes a bug where zlib compression was serialized across all goroutines. This occurred because only one shared zlib decompresser was instantiated for each compression level which had to be locked when used and thus preventing concurrent compression. The decompression performance of zstd is also improved by using a pool of zstd decoders (instantiating a zstd encoded or decoder is fairly expensive). It also slightly cleans up the logic used to store and acquire zstd encoders. The CompressPayload benchmark is changed to use a stable payload, which is now stored in testdata/compression.go. Previously it used the compression.go file itself, which made benchmarks results confusing and liable to change whenever the compression/decompression logic changed. ``` goos: darwin goarch: arm64 pkg: go.mongodb.org/mongo-driver/x/mongo/driver │ base.20.txt │ new.20.txt │ │ sec/op │ sec/op vs base │ CompressPayload/CompressorZLib-10 5387.4µ ± 0% 651.1µ ± 1% -87.91% (p=0.000 n=20) CompressPayload/CompressorZstd-10 64.56µ ± 1% 64.10µ ± 0% -0.72% (p=0.000 n=20) DecompressPayload/CompressorZLib-10 125.7µ ± 1% 123.7µ ± 0% -1.60% (p=0.000 n=20) DecompressPayload/CompressorZstd-10 70.13µ ± 1% 45.80µ ± 1% -34.70% (p=0.000 n=20) geomean 235.3µ 124.0µ -47.31% │ base.20.txt │ new.20.txt │ │ B/s │ B/s vs base │ CompressPayload/CompressorZLib-10 365.2Mi ± 0% 3021.4Mi ± 1% +727.41% (p=0.000 n=20) CompressPayload/CompressorZstd-10 29.76Gi ± 1% 29.97Gi ± 0% +0.73% (p=0.000 n=20) DecompressPayload/CompressorZLib-10 15.28Gi ± 1% 15.53Gi ± 0% +1.63% (p=0.000 n=20) DecompressPayload/CompressorZstd-10 27.39Gi ± 1% 41.95Gi ± 1% +53.13% (p=0.000 n=20) geomean 8.164Gi 15.49Gi +89.77% │ base.20.txt │ new.20.txt │ │ B/op │ B/op vs base │ CompressPayload/CompressorZLib-10 14.02Ki ± 0% 14.00Ki ± 0% -0.10% (p=0.000 n=20) CompressPayload/CompressorZstd-10 3.398Ki ± 0% 3.398Ki ± 0% ~ (p=1.000 n=20) ¹ DecompressPayload/CompressorZLib-10 2.008Mi ± 0% 2.008Mi ± 0% ~ (p=1.000 n=20) ¹ DecompressPayload/CompressorZstd-10 4.109Mi ± 0% 1.969Mi ± 0% -52.08% (p=0.000 n=20) geomean 142.5Ki 118.5Ki -16.82% ¹ all samples are equal │ base.20.txt │ new.20.txt │ │ allocs/op │ allocs/op vs base │ CompressPayload/CompressorZLib-10 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=20) ¹ CompressPayload/CompressorZstd-10 4.000 ± 0% 4.000 ± 0% ~ (p=1.000 n=20) ¹ DecompressPayload/CompressorZLib-10 26.00 ± 0% 26.00 ± 0% ~ (p=1.000 n=20) ¹ DecompressPayload/CompressorZstd-10 104.000 ± 0% 1.000 ± 0% -99.04% (p=0.000 n=20) geomean 10.20 3.193 -68.69% ¹ all samples are equal ```
b84a5ef
to
51a6c7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Matt, thank you for taking the time to fix this PR up and merge it! |
…prove zstd decompression (#1320) Co-authored-by: Matt Dale <[email protected]>
…n and improve zstd decompression (mongodb#1320)" This reverts commit 84a4385.
GODRIVER-2914
Summary
x/mongo/driver: enable parallel zlib compression and improve zstd decompression
This commit fixes a bug where zlib compression was serialized across all
goroutines. This occurred because only one shared zlib decompresser was
instantiated for each compression level which had to be locked when used
and thus preventing concurrent compression.
This commit also slightly improves zstd decompression performance by
using a pool of zstd decoders (instantiating a zstd encoded or decoder
is fairly expensive). It also slightly cleans up the logic used to store
and acquire zstd encoders.
Background & Motivation
N/A