Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rosbag2_py.Recorder] Should add option to disable signal handling changes #1678

Open
EricCousineau-TRI opened this issue May 29, 2024 · 7 comments · May be fixed by #1685
Open

[rosbag2_py.Recorder] Should add option to disable signal handling changes #1678

EricCousineau-TRI opened this issue May 29, 2024 · 7 comments · May be fixed by #1685
Labels
enhancement New feature or request

Comments

@EricCousineau-TRI
Copy link

Description

We are launching a rosbag2_py.Recorder() instance inside of Python, and related to #1458, we do not want this to affect any surrounding control flow.
Additionally, for now (perhaps prototyping, perhaps long-term), the current service options do not allow us to change the output bag path.

We are able to achieve this with some level of hacking:

recorder = rosbag2_py.Recorder()
storage_options = StorageOptions(
    uri=bag_path,
    storage_id="mcap",
)
record_options = RecordOptions()
record_options.all = True
# WARNING: Recorder.start() will mess with signal handlers.
# We want to undo this.
old_sigint = signal.getsignal(signal.SIGINT)
# rosbag2_py.Recorder.record() contains waiting for calling cancel(),
# so we need to use thread not to stop main thread.
thread = threading.Thread(
    target=recorder.record,
    args=(
        storage_options,
        record_options,
    ),
    daemon=True,
)
thread.start()
# Waiting for a small period of time, and restore our original sigint
# handler.
time.sleep(0.01)
signal.signal(signal.SIGINT, old_sigint)

Related Issues

See #1458

Completion Criteria

There is an option or some mechanism to disable any signal handling within rosbag2_py.Recorder.

Implementation Notes / Suggestions

Something like record_options.install_signal_handlers = False.

Testing Notes / Suggestions

Something like the following seems to be sufficient testing:

# example_recorder_sigint_example.py
rclpy.init()
node = rclpy.create_node("example_node")
publisher = node.create_publisher(std_msgs.msg.String, "/test", 1)

recorder = MyThreadedRecorder()
recorder.start(bag_path)

rate = LoopRate(100.0)
try:
    while True:
        msg = std_msgs.msg.String()
        msg.data = str(time.time())
        publisher.publish(msg)
        rate.sleep()
except KeyboardInterrupt:
    pass
finally:
    recorder.stop()


# example_recorder_sigint_test.py
proc = Popen([example_bin, bag_dir])
try:
    time.sleep(1.0)
finally:
    proc.send_signal(signal.SIGINT)
    proc.wait()
assert proc.poll() == 0
assert os.path.exists(bag_dir))
@EricCousineau-TRI EricCousineau-TRI added the enhancement New feature or request label May 29, 2024
@MichaelOrlov
Copy link
Contributor

@EricCousineau-TRI Could you please clarify on what code version or ROS distro you observing the issue that the original signal handler is not called?
We have changed behavior relatively recently. See

throw;
}
process_deferred_signal();
uninstall_signal_handlers();
}

It should be already available in the Jazzy Jalisco release.

It only may not call your original signal handler if it was settled up as SIG_DFL or SIG_IGN, that we can reconsider and remove it from the ignore list for deferred signal processing.

As regards your request "[rosbag2_py.Recorder] Should add option to disable signal handling changes"

  • No. we will not add that option for the rosbag2. Because rosbag2 needs to gracefully finish its work on signal interrupts anyway.
    Otherwise, sooner or later, one will open another issue that rosbag2 doesn't finish correctly when interrupted by a signal when calling from some weird environment.

@MichaelOrlov MichaelOrlov added wad Work As Designed wontfix This will not be worked on labels May 30, 2024
@MichaelOrlov MichaelOrlov removed wontfix This will not be worked on wad Work As Designed labels Jun 1, 2024
@MichaelOrlov
Copy link
Contributor

@EricCousineau-TRI After some close consideration and prototyping, I think we can add an option to disable signal handling in the Player and Recorder.

However, It will be a new API rather than an extension of the existing one to be able to backport it, at least to Jazzy.

Also, when using a new API with enable_signal_handling = False, the caller shall handle signals on the Python level and call Player.cancel() or Recorder.cancel() respectively after processing signals to stop the player or recorder correctly.

Please referer to the #1685 for details.

@EricCousineau-TRI
Copy link
Author

EricCousineau-TRI commented Jun 2, 2024

@MichaelOrlov Thank you for the thorough information!

Could you please clarify on what code version or ROS distro you observing the issue that the original signal handler is not called?

This is on humble; I see that this does not have the process_deferred_signal() you mentioned, so I can see how it would not get triggered.

As an aside, FWIW, I believe there is a bug where SIGTERM is a typo, and it should have been SIGINT (on line 239):
Filed #1686

It only may not call your original signal handler [...]

In our case, our signal handler is simply the default Python SIGINT signal handler, accessible via

import signal
signal.default_int_handler

After some close consideration and prototyping, I think we can add an option to disable signal handling in the Player and Recorder.

Awesome! Thank you for very much for prototyping and considering this.

To your earlier point of not introducing the option - I understand that the intent is to ensure clean shutdown / cleanup of the signal handler, and signal handling is one option to do so.

However, I would assume that if this were library code, RAII / relying on the calling code to do so is on the onus of the user, e.g. using a with statement, __del__, atexit.register(), etc.

That being said, looking at the functionality of Recorder in both humble and rolling, and the presences of rclcpp::init() and rclcpp::shutdown(), it does seem like this code isn't necessarily strictly library code, but instead a thin wrapper for application code where the entire process is strictly for recording / playback.

Is my read on this correct?

If so, then maybe the better route is to provide a separate, more library-like class (i.e., no global side effects like rclcpp::init() / ::shutdown(), signal handling, etc)?

This could live in rosbag2_py or just in downstream code.

Also, it seems like there may be some disparity between what is offered between rosbag2_cpp and rosbag2_py, both in terms of contract (_cpp seems geared towards library functionality, _py seems geared towards both library and application functionality).

@MichaelOrlov
Copy link
Contributor

@EricCousineau-TRI Yes, your understanding that rorosbag2_py is more application wrapper is correct. At least to some certain extent, especially for such entities like recorder and player.

There are a few reasons for that:

  1. Historically it was designed this way to have a quick and simple interface to the ros2bag CLI interface.
  2. The python users usually a "hackers" who need to write literally a couple lines of code "throw it to the wall" and expect that everything should work.

If we would provide just a library kind of Python classes for the player and recorder, it would be a myriad issues from users.
Starting from "why I instantiated a recorder class run record(..) method and it is seagfaulted?". i.e. forgotten to call rclcp::init(). And finishing that "why services don't work or why player or recorder do nothing after the start?" i.e. forgotten to create executor and spin the recorder or player nodes.
It would be just a huge headache.

I see the real use case when option to disable signal handling in Python API could be useful.
However, no-one yet provided a real-life use case when Python version of the player and recorder designed as real library components without rclcpp::init(), executor and spinning node inside would be useful or when the current implementation of the player or recorder can't be used.
If you have one - please share. It would be good to know and take into consideration.
Another tough decision is naming for such a new library like API for recorder and player classes.

@EricCousineau-TRI
Copy link
Author

However, no-one yet provided a real-life use case when Python version of the player and recorder designed as real library components without rclcpp::init(), executor and spinning node inside would be useful or when the current implementation of the player or recorder can't be used.

In our case, we are recording data from a robot with some high-level control in Python, where we denote things as episodes via Python. To simplify / align with our existing data pipeline, we want a per-episode rosbag, avoiding large amounts of delay from using a subprocess, and ensure that we have minimal side effects so that we can stop/start this 100's if not 1000's of times in a single process without worrying about race conditions.

If you have one - please share. It would be good to know and take into consideration.

This is a very rough export of my colleague's library code that we are using at TRI:
EricCousineau-TRI/repro@9bae5d6

The testing confirms minimal delay from starting/stopping, ensure all data is recorded within that bounds, and that we can ensure that we have no issues with Ctrl+C / SIGINT signaling.

At a very high-level, the use case is roughly something like this in a gymnasium-like API:

for episode_index in episode_indices:
  obs, ... =  env.reset()
  rosbag_recorder.start(get_rosbag_file(episode_index))

  done = False
  while not done:
    act = agent.sample_action(obs)
    obs, ... = env.step(act)

  rosbag.stop()

Inside this loop, we record low-rate data (at rate of agent / env), but we also want related high-rate data, hence this approach.

@MichaelOrlov
Copy link
Contributor

@EricCousineau-TRI Thank you for providing a use-case example, and sorry for my late response.

I forgot to mention in my previous post about the third very important reason why rosbag2_py::Recorder is designed more like an application.
3. Due to the suboptimal performance when creating a node and spinning it in the executor from Python code.
i.e., spinning a node in executor from Python is not the same as spinning it from C++.

As far as I see from your example having Python API to the exact same library kind of recorder and player as we have on the C++ layer will not be helpful in your scenario.
What you really need is an API for the ability to start, stop and init recorder with the new bag filename, perhaps even more generic init with storage_options and recorder_options. Or API for bag split operation with the parameter for a new filename.

Please note that we already have a C++ API for the recorder start and stop as well as bag_split. The only missing parts are the init and extension of the bag_split with the provisioning filename. Also, we will need to bring those APIs to the Python interface.

@EricCousineau-TRI
Copy link
Author

EricCousineau-TRI commented Jun 27, 2024

[...] i.e., spinning a node in executor from Python is not the same as spinning it from C++.
As far as I see from your example having Python API to the exact same library kind of recorder and player as we have on the C++ layer will not be helpful in your scenario.
What you really need is an API for the ability to start, stop and init recorder [...]

Our example does what you mentioned - spinning in C++, and only exposes start (with new bag file) / stop / cancel to Python using storage_options and recorder_options.
See rosbag_recording_py.cc for C++ (incl. the spinning), and rosbag_interface.py for the Python calling code. The only thing missing here is the bag_split, but we do not yet have a need for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants