GODRIVER-2101 Direct read/write retries to another mongos if possible #1358
Conversation
return nil, err
}

filteredServers := filterDeprioritizedServers(selectedServers, oss.deprioritizedServers)
This method was designed so that more filters can be added if the need arises in the future.
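For readers following the thread, here is a minimal sketch of what a deprioritization filter of this kind could look like. It is not the driver's implementation: the `server` type and the `filterDeprioritized` function are hypothetical stand-ins, and the fallback to the full candidate list when every candidate is deprioritized is an assumption based on the "when possible" wording in the PR summary.

```go
package main

import "fmt"

// server is a hypothetical stand-in for the driver's server description type.
type server struct {
	Addr string
}

// filterDeprioritized returns the candidates that are not in deprioritized.
// Assumption: if the filter would remove every candidate, the original
// candidates are returned unchanged, so selection never fails solely
// because of deprioritization.
func filterDeprioritized(candidates, deprioritized []server) []server {
	if len(deprioritized) == 0 {
		return candidates
	}

	// Index the deprioritized servers by address for constant-time lookups.
	skip := make(map[string]struct{}, len(deprioritized))
	for _, s := range deprioritized {
		skip[s.Addr] = struct{}{}
	}

	allowed := make([]server, 0, len(candidates))
	for _, s := range candidates {
		if _, found := skip[s.Addr]; !found {
			allowed = append(allowed, s)
		}
	}

	// Fall back to the unfiltered list rather than returning nothing.
	if len(allowed) == 0 {
		return candidates
	}
	return allowed
}

func main() {
	candidates := []server{{Addr: "mongos-a:27017"}, {Addr: "mongos-b:27017"}}

	// One mongos failed earlier, so only the other one remains eligible.
	fmt.Println(filterDeprioritized(candidates, []server{{Addr: "mongos-a:27017"}}))

	// Every candidate failed earlier, so the filter keeps all of them.
	fmt.Println(filterDeprioritized(candidates, candidates))
}
```

Keeping the filter as a separate step makes it straightforward to chain further filters later, which matches the intent stated in the comment above.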
Looks good, with one question about the behavior when CSOT is enabled 👍
// Note that setting this value greater than 2 will result in false
// negatives. The current specification does not account for CSOT, which
// might allow for an "infinite" number of retries over a period of time.
// Because of this, we only track the "previous server".
Is there a task for updating the retryable reads/writes "deprioritized mongos" behavior to account for multiple retries (i.e. CSOT)? The vast majority of sharded clusters have >2 mongos nodes, so that seems like a questionably useful feature for drivers that support CSOT.
For now, there is no task to do this. Here are a couple of reasons from discussions with @comandeo:
- We do not want this new mechanism to replace SDAM or interfere with it too much.
- We believe that a mongos may recover from the error fast enough, so there is no reason to exclude the ones that failed earlier.
- It is rare for multiple mongoses to fail with retryable errors. That would look like a network issue, which is handled by SDAM.
943ecbe
API Change Report
No changes found!
GODRIVER-2101
Summary
When possible, deprioritize failed mongos during retry attempts.
Background & Motivation
Our current retry logic for sharded clusters can lead to an operation that failed with a retryable error being retried on the same mongos.