Bugfix/fix move errors #477

JonathanWylie · 2021-08-19T02:27:43Z

This PR contains two bug fixes, a logging improvement, and a testing improvement:
1/ Startup nodes were not being populated correctly when initialising the node manager.
2/ Handling of SlotNotCoveredError was raising the exception one attempt too early, before all the attempts allowed by the TTL had been used.
3/ When a command ultimately succeeds, it's nice not to have a bunch of exception logs. Instead, log the exception only if all attempts are used up; otherwise log a warning. It's useful to be able to differentiate between a command that failed fatally and one that was ultimately successful, perhaps with some warnings about short-lived problems along the way.
4/ It took me forever to work out what kind of cluster I need to run the tests against. It was fairly simple in the end, but I have included a docker-compose.yml file which can be used to start the redis cluster for the tests.

Could you also give a rationale for picking a random node when a timeout error happens?
I am also curious as to what conditions it is trying to handle. Every time I have seen this in action, picking a different node just results in a MOVED error pointing the client back to the original node. So it's unclear to me why it doesn't just keep trying the correct node, instead of picking an incorrect node only to be redirected back.
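
A toy model of the behaviour described above may make the question concrete. It assumes slot ownership has not changed while the owner was timing out; `MovedError`, `ask_node` and `retry_on_random_node` are hypothetical names for illustration and are not the client's real API.

```python
import random


class MovedError(Exception):
    """Toy redirect error carrying the node that owns the slot."""

    def __init__(self, owner):
        super().__init__(f"MOVED to {owner}")
        self.owner = owner


def ask_node(node, slot_owner):
    """Toy node: answers only if it owns the slot, otherwise redirects."""
    if node != slot_owner:
        raise MovedError(slot_owner)
    return "OK"


def retry_on_random_node(nodes, slot_owner, rng):
    """After a timeout on slot_owner, try a random node and follow one redirect.

    Returns the list of nodes contacted, in order.
    """
    hops = [rng.choice(nodes)]
    try:
        ask_node(hops[0], slot_owner)
    except MovedError as err:
        # The random node does not own the slot, so it points back at the owner.
        hops.append(err.owner)
    return hops
```

In this model every path ends at the original owner, which is the round trip the question is about. The model says nothing about cases where the topology really has changed, which may be what the random pick is meant to cover.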

Some errors are fatal, so we should log the exception. Other errors we will try to handle and retry; in such a case, just log a warning unless it is the final attempt. This means successful retries will not result in an exception being logged.

- SlotNotCoveredError handling was re-raising the exception one attempt too soon.
- Similarly for log_exception.
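
The retry and logging behaviour described in these notes can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: `execute_with_retries`, `send` and the local `SlotNotCoveredError` stand-in are assumptions made so the sketch runs on its own, and `ttl` is taken to mean the total number of attempts allowed.

```python
import logging

log = logging.getLogger("retry_sketch")


class SlotNotCoveredError(Exception):
    """Stand-in for the client's retryable error, defined here for the sketch."""


def execute_with_retries(send, ttl=3):
    """Call send() up to ttl times, re-raising only when the last attempt fails."""
    for attempt in range(1, ttl + 1):
        # Compare against ttl itself so the final allowed attempt is really used.
        is_last_attempt = attempt == ttl
        try:
            return send()
        except SlotNotCoveredError:
            if is_last_attempt:
                # All allowed attempts are used up: log with traceback and re-raise.
                log.exception("command failed after %d attempts", attempt)
                raise
            # Attempts remain: a warning is enough, no traceback.
            log.warning("slot not covered on attempt %d of %d, retrying", attempt, ttl)


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky():
        # Hypothetical command that fails twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise SlotNotCoveredError("slot not covered")
        return "OK"

    print(execute_with_retries(flaky, ttl=3))
```

With `ttl=3` the demo command logs two warnings and then succeeds, so no exception traceback is logged. With `ttl=2` the same command would exhaust its attempts and the error would be logged and re-raised.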

populate_startup_nodes is meant to put all the known nodes in the startup nodes list, but when it was called, the node list it reads from (self.nodes) had not been populated yet. So populate the startup nodes after initialising self.nodes; otherwise, when we choose a random connection (which is chosen from the startup nodes), the options are too limited. Often a user will specify only a single startup node at the beginning.
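
The ordering issue can be illustrated with a small stand-in class. Only the names populate_startup_nodes, self.nodes and startup nodes come from the description above; `NodeManagerSketch`, `discover` and `random_startup_node` are hypothetical and do not reflect the real node manager's interface.

```python
import random


class NodeManagerSketch:
    """Toy stand-in for a cluster node manager, not the library's real class."""

    def __init__(self, startup_nodes, discover):
        # startup_nodes: the nodes the user configured, often just one.
        self.startup_nodes = list(startup_nodes)
        self.nodes = {}
        # discover: hypothetical callable returning {name: node_info}.
        self.discover = discover

    def initialize(self):
        # Fill self.nodes first ...
        self.nodes = self.discover(self.startup_nodes)
        # ... and only then copy the known nodes into the startup list.
        # Calling this before self.nodes is set would add nothing.
        self.populate_startup_nodes()

    def populate_startup_nodes(self):
        for node in self.nodes.values():
            if node not in self.startup_nodes:
                self.startup_nodes.append(node)

    def random_startup_node(self):
        # A random connection is drawn from the startup nodes.
        return random.choice(self.startup_nodes)
```

If populate_startup_nodes ran before self.nodes was filled, a user who configured a single node would leave random_startup_node with exactly one choice.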

Add a docker-compose.yml file configuring a redis cluster instance for running the tests against.

JonathanWylie added 3 commits September 3, 2021 03:35

Don't log so many exceptions.

f731e9d

Fix detecting when it is the last attempt

6b4dbf3

JonathanWylie force-pushed the bugfix/fix_move_errors branch from 1982253 to 188e71e on September 3, 2021 02:37

Make it easier to run the tests.

dffbd52

JonathanWylie force-pushed the bugfix/fix_move_errors branch from 188e71e to dffbd52 on November 11, 2021 04:42
