Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: Handle server disconnects #11

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

charlesbaynham
Copy link
Contributor

The second half of m-labs/artiq#1361, aiming to handle server disconnects sensibly in the rpc clients.

We previously discussed using similar logic to sync_struct to handle disconnects. sync_struct treats a server disconnect as a normal event and just finishes its task. It also invokes a callback, if supplied.

However, I don't think that works here because neither the Client nor the AsyncioClient have an infinite loop running, from which we can exit on disconnect. Both objects are queried by the user code, and so both need to signal to the user that the server has closed the connection and it will never work again. I think this is what exceptions are for.

I've therefore raised ConnectionAbortedError when the user calls methods that interact with the server but the server has shut down.

Remaining questions are what to do with BestEffortClient? This currently returns None for methods that fail due to connection errors, but a closed socket will never be resolved. Should it attempt to reopen the socket each time it's called?

@charlesbaynham charlesbaynham marked this pull request as draft May 26, 2020 12:02
@charlesbaynham charlesbaynham changed the title WIP client: Handle server disconnects client: Handle server disconnects May 26, 2020
@charlesbaynham charlesbaynham marked this pull request as ready for review May 26, 2020 12:03
@charlesbaynham charlesbaynham marked this pull request as draft May 26, 2020 12:03
@charlesbaynham
Copy link
Contributor Author

@sbourdeauducq would appreciate your thoughts.

@charlesbaynham charlesbaynham force-pushed the handle_server_disconnects branch from 451d09a to 56f508a Compare May 26, 2020 12:05
sipyco/pc_rpc.py Outdated Show resolved Hide resolved
sipyco/pc_rpc.py Outdated Show resolved Hide resolved
@sbourdeauducq
Copy link
Member

This currently returns None for methods that fail due to connection errors, but a closed socket will never be resolved. Should it attempt to reopen the socket each time it's called?

Yes, with the background thread - preferably as soon as the connection fails, not when something is called. Maybe the thread should not terminate and keep monitoring the connection somehow, if that's possible to do at all in Python.

@charlesbaynham
Copy link
Contributor Author

Could keep checking the ping() method of the Server? That's required to exist anyway since otherwise the controller manager kills the server.

@charlesbaynham
Copy link
Contributor Author

@sbourdeauducq
Copy link
Member

The main issue will be with the concurrent access from the thread.

@charlesbaynham
Copy link
Contributor Author

charlesbaynham commented May 27, 2020

OK, I've reimplemented BestEffortClient as discussed. It now has a continuous loop in a separate thread which monitors the connection and reconnects when required. I've also removed a lot of the repeated logic in BestEffortClient in favour of composition, using a Client.

I've exposed the status of all the three clients through is_closed(), which involved adding get_rpc_method_list() to the asyncio client as well (which is probably sensible since both the others have it already).

I've added tests for the new features.

Todo:

  • Release notes
  • Docs

@charlesbaynham
Copy link
Contributor Author

charlesbaynham commented May 27, 2020

If someone is feeling motivated it might be nice to give AsyncioClient the same treatment: I think it could use the logic in Client fairly easily, which would reduce redundancy. But it works already, so I think it's fine for now.

If you agree, I'll squash and write it up in the docs.

sipyco/pc_rpc.py Outdated Show resolved Hide resolved
sipyco/pc_rpc.py Outdated Show resolved Hide resolved
sipyco/pc_rpc.py Outdated
Comment on lines 417 to 519
def get_local_host(self):
raise NotImplementedError
with self.__client_lock:
client_method = getattr(self.__client, name)
try:
return client_method(*args, **kwargs)
except ConnectionError:
self.__mark_invalid()
return None
return tryer
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a bug here: if the Server returns a ConnectionError as the valid, unpacked output of its method this will be misinterpreted as a failed connection. I've added a failing test to highlight this, will have a think as to how to solve.

Copy link
Contributor Author

@charlesbaynham charlesbaynham Jun 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is resolved in the latest commit: Client now raises RPCConnectionError which is a subclass of ConnectionError if an RPC connection error occurs, which allows you to tell the difference between errors arising from the RPC code and errors that were unpacked. This is backwards compatible because it's still a ConnectionError so any code which looks for these will still receive them.

* Handle closed connections from the server.
* Refactor BestEffortClient to reuse logic from Client via composition
* Add a thread to BestEffortClient which regularly checks for connectivity and attempts to reestablish it if it is lost
@charlesbaynham charlesbaynham force-pushed the handle_server_disconnects branch from 17ce284 to ca911b0 Compare June 5, 2020 16:22
@charlesbaynham charlesbaynham marked this pull request as ready for review June 5, 2020 16:23
@charlesbaynham
Copy link
Contributor Author

charlesbaynham commented Jun 5, 2020

I notice that there are no release notes in this package. Could add some?

@charlesbaynham
Copy link
Contributor Author

@sbourdeauducq I noticed this one is still waiting. Do you think it's ready for merge?

@sbourdeauducq
Copy link
Member

Sorry, I don't currently have time to properly review this. This is a rather large and complex change and I'm too busy with the Zynq port.

@charlesbaynham
Copy link
Contributor Author

Sorry, I don't currently have time to properly review this. This is a rather large and complex change and I'm too busy with the Zynq port.

No worries, just making sure it wasn't forgotten

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants