
Really Poor Real-time Factor in Gazebo ROS simulation compared to Fast-RTPS #207

Closed
Michael-Equi opened this issue Jul 17, 2020 · 3 comments

Comments

@Michael-Equi

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04 (Running on Desktop with intel i7 7700k, GTX 1080, and 32GB RAM) with ROS2 Foxy
  • Installation type:
    • binaries
  • Version or commit hash:
    • 0.7.2-1focal.20200708.052037
  • DDS implementation:
    • Cyclone DDS

Steps to reproduce issue

Run the TB3 gazebo simulation provided by the navigation2 bringup launch file (https://github.com/ros-planning/navigation2/blob/master/nav2_bringup/bringup/launch/tb3_simulation_launch.py).
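For reference, this is roughly the sequence being run (a sketch; the `headless` argument and environment setup shown here are illustrative, the launch file is the one linked above):

```bash
# Select Cyclone DDS as the RMW implementation (the case that shows the slowdown).
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp

# Launch the TB3 Gazebo simulation from the navigation2 bringup package.
ros2 launch nav2_bringup tb3_simulation_launch.py headless:=False
```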

Expected behavior

The simulation should have no problem maintaining a real-time factor of roughly 0.95-1.0 on a computer with the specs listed above; Fast RTPS sustains that real-time factor without issue.

Actual behavior

The real-time factor with Cyclone DDS drops below 0.5 on the TB3 simulation and to 0.1 on my own robot's simulation. Fast-RTPS handles both at a real-time factor of 0.95+.

Additional information

I have tried increasing the receive buffer size, with no effect, using the following command: `sudo sysctl -w net.core.rmem_max=8388608 net.core.rmem_default=8388608`
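The full sequence was roughly the following (a sketch; it only changes the limits for the current boot and does not persist them):

```bash
# Raise the kernel's UDP receive buffer limits (takes effect immediately, not persistent).
sudo sysctl -w net.core.rmem_max=8388608 net.core.rmem_default=8388608

# Confirm the kernel accepted the new values.
sysctl net.core.rmem_max net.core.rmem_default
```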

@eboasson
Collaborator

@Michael-Equi I fully agree with you that a real-time factor of 0.5 (or 0.1!) is unacceptably bad performance. I've been unable to reproduce your observations, but that may well be because I only have an Ubuntu 20.04 VM on my MacBook Pro, so the timing will be very different. Even with this much less powerful setup, though, both Fast-RTPS and Cyclone DDS end up with a real-time factor that varies between roughly 0.8 and 0.9 while the robot is moving.

It could be that you are running into something similar to eclipse-cyclonedds/cyclonedds#484 (though generally increasing the socket receive buffers does help with that). In any case, current master of Cyclone has fixes for that and more, such that increasing the UDP socket buffer beyond the default is probably not even worth the bother anymore (see eclipse-cyclonedds/cyclonedds#558, which includes eclipse-cyclonedds/cyclonedds#555).

Could you do me a favour and try it with a build of current master of ROS 2, or at least with the current master of cyclonedds and with rmw_cyclonedds_cpp 0.7.2 on top of Foxy? (Note that I had to cherry-pick f95c496 from #187 to solve a problem with a missing odom-to-base_link transform.) While the interesting changes are in cyclonedds, you do need to rebuild the RMW layer, because it uses some unstable interfaces in Cyclone that had a binary-incompatible change. A rough sketch of what that overlay workspace could look like follows below.
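In case it helps, something along these lines on top of a Foxy binary install (the repository layout, workspace path, and exact set of packages to rebuild are assumptions, and the f95c496 cherry-pick is not shown):

```bash
# Overlay workspace on top of the Foxy binaries.
source /opt/ros/foxy/setup.bash
mkdir -p ~/cyclone_ws/src && cd ~/cyclone_ws/src

# Current master of Cyclone DDS and the matching RMW layer.
git clone https://github.com/eclipse-cyclonedds/cyclonedds.git
git clone https://github.com/ros2/rmw_cyclonedds.git

# Rebuild the RMW layer against the new Cyclone DDS (it relies on unstable
# Cyclone interfaces that had a binary-incompatible change).
cd ~/cyclone_ws
colcon build --packages-up-to rmw_cyclonedds_cpp

# Use the overlay and select Cyclone DDS.
source install/setup.bash
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
```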

If that doesn't solve the problem, the first order of business will be to determine whether it is eating CPU cycles like crazy or rather sitting and twiddling its thumbs when it should be working (I suspect the latter). Per-thread CPU usage from `top` would probably be a good thing to collect. The next thing would then likely be looking at the network traffic, either using Wireshark or by getting Cyclone DDS to write a trace via `export CYCLONEDDS_URI='<Tr><V>finest</><Out>cdds.log.${CYCLONEDDS_PID}</></>'` (but beware that these traces are huge and writing them slows things down). If it ends up requiring this, would you be willing to share those files with me?
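As a minimal sketch of what collecting that might look like (file names here are placeholders):

```bash
# Snapshot of per-thread CPU usage: -H shows threads, -b/-n 1 gives scriptable batch output.
top -b -H -n 1 > per_thread_cpu.txt

# Enable the Cyclone DDS trace for a short run only; the logs are large and slow things down.
export CYCLONEDDS_URI='<Tr><V>finest</><Out>cdds.log.${CYCLONEDDS_PID}</></>'
```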

@Michael-Equi
Author

Building from source appears to fix the issue, now running at 0.96-1.0 RTF. Thanks!

@eboasson
Collaborator

I've just tagged a 0.7.0 release candidate of Cyclone DDS that includes the fixes.
