Remove HitsThresholdChecker. #13943
`TopScoreDocCollectorManager` has a dependency on `HitsThresholdChecker`, which is essentially a shared counter that is incremented until it reaches the total hits threshold, when the scorer can start dynamically pruning hits.

A consequence of this removal is that dynamic pruning may start later, as soon as:
- either the current slice collected `totalHitsThreshold` hits,
- or another slice collected `totalHitsThreshold` hits and the current slice collected enough hits (up to 1,024) to check the shared `MaxScoreAccumulator`.

So in short, it exchanges a bit more work globally in favor of a bit less contention. A longer-term goal of mine is to stop specializing our `CollectorManager`s based on whether they are going to be used concurrently or not.
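To make the two conditions for starting to prune concrete, here is a minimal sketch in Java. It is not the code from this PR: `PruningStartSketch`, `SliceCollector`, and the fixed `CHECK_INTERVAL` are hypothetical names chosen for illustration, a plain `DoubleAccumulator` stands in for the shared `MaxScoreAccumulator`, and checking exactly every 1,024 hits is an assumption (the description only says "up to 1,024").

```java
import java.util.concurrent.atomic.DoubleAccumulator;

/** Illustrative only; all names here are hypothetical, not Lucene APIs. */
public class PruningStartSketch {

    /** Assumed fixed interval at which a slice looks at the shared state. */
    static final int CHECK_INTERVAL = 1024;

    /** Per-slice collector that counts its own hits, with no shared hit counter. */
    static final class SliceCollector {
        private final int totalHitsThreshold;
        private final DoubleAccumulator sharedMinScore; // stand-in for MaxScoreAccumulator
        private int localHits = 0;
        private boolean pruning = false;

        SliceCollector(int totalHitsThreshold, DoubleAccumulator sharedMinScore) {
            this.totalHitsThreshold = totalHitsThreshold;
            this.sharedMinScore = sharedMinScore;
        }

        /** Records one hit and returns whether this slice may prune from now on. */
        boolean collect(double localMinCompetitiveScore) {
            localHits++;
            if (localHits >= totalHitsThreshold) {
                // Condition 1: this slice alone collected totalHitsThreshold hits,
                // so it publishes its minimum competitive score for other slices.
                sharedMinScore.accumulate(localMinCompetitiveScore);
                pruning = true;
            } else if (localHits % CHECK_INTERVAL == 0
                    && sharedMinScore.get() > Double.NEGATIVE_INFINITY) {
                // Condition 2: another slice already published a score, and this
                // slice collected enough hits to check the shared accumulator.
                pruning = true;
            }
            return pruning;
        }
    }

    public static void main(String[] args) {
        DoubleAccumulator shared = new DoubleAccumulator(Math::max, Double.NEGATIVE_INFINITY);
        SliceCollector first = new SliceCollector(2000, shared);
        SliceCollector second = new SliceCollector(2000, shared);

        boolean firstPrunes = false;
        for (int i = 0; i < 2000; i++) {
            firstPrunes = first.collect(1.5);
        }
        boolean secondPrunes = false;
        for (int i = 0; i < 1024; i++) {
            secondPrunes = second.collect(0.5);
        }
        System.out.println("first slice prunes: " + firstPrunes);
        System.out.println("second slice prunes: " + secondPrunes);
    }
}
```

In this sketch the only cross-thread traffic is the occasional read or write of the accumulator, which is where the "bit less contention" comes from. The "bit more work" shows up as the hits the second slice collects before its next check of the shared state.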
EDIT: at the time of this comment, only wikibigall with a
Actually, while I was at it, I also removed
Queries sorted by field seem to benefit from the lesser synchronization across search threads.
Looks like a good change to me (other than needing a CHANGES entry?). Nice simplification and looks like a positive performance outcome.
Wow, it's been a bigger speedup on nightly benchmarks than on my machine: https://benchmarks.mikemccandless.com/And3Terms.html.
Our collector managers have a `supportsConcurrency` flag to optimize the case when they are used in a single thread. This PR proposes to remove this flag now that the optimization doesn't do much as a result of apache#13943.
Our top-docs collectors have a dependency on `HitsThresholdChecker`, which is essentially a shared counter that is incremented until it reaches the total hits threshold, when the scorer can start dynamically pruning hits.

A consequence of this removal is that dynamic pruning may start later, as soon as:
- either the current slice collected `totalHitsThreshold` hits,
- or another slice collected `totalHitsThreshold` hits and the current slice collected enough hits (up to 1,024) to check the shared `MaxScoreAccumulator`.

So in short, it exchanges a bit more work globally in favor of a bit less contention. A longer-term goal of mine is to stop specializing our `CollectorManager`s based on whether they are going to be used concurrently or not.