This repository has been archived by the owner on Jan 9, 2024. It is now read-only.
This PR contains two bug fixes, a logging improvement, and a testing improvement:
1/ Startup nodes were not being populated correctly when initialising the node manager.
2/ Handling of SlotNotCoveredError raised the exception one attempt too early, before all the attempts allowed by the TTL had been used.
3/ When a command is executed and ultimately succeeds, it's nice not to have a bunch of exception logs. The exception is now logged only if all attempts are used up; otherwise a warning is logged. This makes it possible to differentiate between a command that failed fatally and one that was ultimately successful, with perhaps some warnings about short-lived problems along the way.
4/ It took me forever to work out what kind of cluster I needed to run the tests against. It was fairly simple in the end, but I have included a docker_compose.yml file which can be used to start the redis cluster for the tests.
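To illustrate points 2/ and 3/, here is a minimal sketch of the intended retry behaviour. It is not the code from this PR: `execute_with_retries`, the `send` callable, and the local `SlotNotCoveredError` class are hypothetical stand-ins, and `ttl` is assumed to mean the total number of attempts allowed.

```python
import logging

log = logging.getLogger("retry_sketch")


class SlotNotCoveredError(Exception):
    """Hypothetical stand-in for the client's error of the same name."""


def execute_with_retries(send, ttl=3):
    """Call send() up to ttl times; raise only once the last attempt fails."""
    for attempt in range(1, ttl + 1):
        try:
            return send()
        except SlotNotCoveredError as exc:
            if attempt == ttl:
                # Every allowed attempt is used up: this is the fatal case,
                # so log the full exception and re-raise it.
                log.exception("command failed after %d attempts", ttl)
                raise
            # Short-lived problem: a warning is enough, then try again.
            log.warning("attempt %d/%d failed: %s", attempt, ttl, exc)
```

With `ttl=3`, a command that fails twice and then succeeds produces two warnings and no exception log, while a command that fails three times produces two warnings followed by one exception log.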
Could you also give a rationale for picking a random node when a timeout error happens?
I am also curious as to what conditions it is trying to handle. Every time I have seen this in action, picking a different node just results in a MOVED error pointing the client back to the original node. So it's unclear to me why it doesn't just keep trying the correct node, instead of picking an incorrect node only to be redirected back.