Poor performance of ROS1 node compared to OrbbecViewer #53

Open
marcomasa opened this issue Oct 16, 2024 · 4 comments

@marcomasa

I am not sure how best to describe this, but I am noticing very large performance differences between the ROS1 node and the standalone OrbbecViewer.

In the OrbbecViewer, I can effortlessly stream 4K color and maximum-resolution depth images, merge them into a point cloud, and hold a fairly stable 30 fps throughout the session.

When I launch the ROS1 node with the minimal configuration (1280x720), I get very unstable recordings (on various machines) and even complete frame freezes for certain scenes. Is this a known issue?

@jian-dong
Contributor

Hi @marcomasa,
The performance difference you're noticing between the ROS1 node and the standalone OrbbecViewer is primarily due to how data is handled and processed in each case.

In the OrbbecViewer, the software accesses and renders the data directly with minimal overhead, which allows it to handle high-resolution streams like 4K color and depth efficiently while maintaining stable frame rates.

The ROS1 node, on the other hand, introduces additional overhead for data transmission: publishing and subscribing to topics involves serialization, deserialization, and inter-process transfer, all of which can hurt performance, especially at higher resolutions. The frame freezes and instability you're observing are likely caused by this transmission overhead, and potentially by how ROS schedules the processing pipeline in your specific setup.

If performance is critical, optimizing the ROS pipeline, reducing message sizes, and offloading heavy computation outside ROS nodes can help mitigate some of these issues.
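To get a feel for the data volumes and transport cost involved, here is a small stdlib-only Python sketch. The resolution and pixel format are assumptions matching the minimal 1280x720 configuration mentioned above; the pipe transfer is only a rough stand-in for one ROS publish/subscribe hop, not ROS's actual transport:

```python
import multiprocessing as mp
import time

# Assumed stream parameters: 1280x720 bgr8 @ 30 fps (the "minimal configuration").
WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS = 1280, 720, 3, 30

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL  # one uncompressed frame
per_second = frame_bytes * FPS                  # raw bandwidth for one topic

def consumer(conn):
    # Receiving side of the "topic": each message is copied across the process boundary.
    data = conn.recv_bytes()
    conn.send(len(data))

if __name__ == "__main__":
    print(f"frame: {frame_bytes / 1e6:.1f} MB, stream: {per_second / 1e6:.1f} MB/s")
    parent, child = mp.Pipe()
    proc = mp.Process(target=consumer, args=(child,))
    proc.start()
    start = time.perf_counter()
    parent.send_bytes(bytes(frame_bytes))       # serialize + copy, like a publish hop
    received = parent.recv()
    elapsed = time.perf_counter() - start
    proc.join()
    print(f"one inter-process frame transfer: {elapsed * 1e3:.2f} ms ({received} bytes)")
```

Even at 720p this is roughly 83 MB/s per uncompressed color topic before any point cloud processing, which is why the transport overhead matters here in a way it does not inside the viewer.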

@marcomasa
Author

Thank you for the details!

I noticed that recordings made with the Windows Orbbec Viewer are still ROS1 bags, and these show no frame drops when I record on the same machine.

This would be a valid alternative for my use case for the time being, but unfortunately there is no option to record the image streams and IMU data at the same time. I can enable and stream both, but there is no option to add the IMU data to the bag.

Is it planned to add this option to the standalone viewer?

@bmegli commented Oct 21, 2024

@marcomasa

Just some hints; not all may apply to your case.

If you look at an example sensor workflow, say the Femto Bolt:

Data from sensor

Native data at higher resolutions is MJPEG.

ROS driver

This data is decompressed by the ROS driver.

If you are lucky (Nvidia or Rockchip hardware) you may try using a hardware decoder.

ROS driver workflow

If you try to record compressed data in ROS through image_transport (e.g. `compressed`), you end up with this workflow:

  • compressed data from sensor
  • decompressed by Orbbec SDK ROS (+ maybe color conversion)
  • compressed again by ROS image_transport

Unless care has been taken by the ROS driver writer:

  • internal decompression will block driver processing
  • image_transport will block driver processing while compressing
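One common mitigation for that blocking is to hand compression to a worker thread behind a bounded queue, so the driver callback returns immediately. This is a hypothetical Python sketch of the pattern, not the Orbbec driver's actual code, with zlib standing in for the MJPEG codec:

```python
import queue
import threading
import zlib

# The "driver callback" hands frames to a worker thread through a bounded
# queue, so it never blocks on the compressor.
frames = queue.Queue(maxsize=4)  # bounded: prefer dropping frames to stalling
results = []

def compressor():
    while True:
        frame = frames.get()
        if frame is None:        # shutdown sentinel
            return
        results.append(zlib.compress(frame, level=1))

worker = threading.Thread(target=compressor)
worker.start()

for _ in range(10):              # stand-in for incoming sensor frames
    frame = bytes(1280 * 720 * 3)
    try:
        frames.put_nowait(frame) # the "callback" returns immediately
    except queue.Full:
        pass                     # dropping one frame beats blocking the driver

frames.put(None)                 # shut the worker down
worker.join()
print(f"compressed {len(results)} frames (rest dropped)")
```

The key design choice is the bounded queue: when the compressor cannot keep up, frames are dropped at a known point instead of back-pressuring the sensor read loop.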

On the other hand, if you try to record in an uncompressed format:

  • you still have the internal decompression of the data
  • at higher resolutions this is a huge amount of data

What might be done

You might shortcut the ROS driver to publish the MJPEG data directly (this is like the compressed image_transport in ROS).

  • after checking with 2-3 lines of code how much time decompression/compression takes, it may not be worth the effort
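As a sketch of such a timing check (zlib here is only a stand-in; in practice you would wrap the driver's actual decompression call, or cv2.imencode/cv2.imdecode, in the same two timing lines):

```python
import time
import zlib

# zlib stands in for the MJPEG codec purely to demonstrate the measurement;
# replace the two calls below with the real codec calls in your pipeline.
frame = bytes(1280 * 720 * 3)  # dummy 1280x720 bgr8 frame

t0 = time.perf_counter()
compressed = zlib.compress(frame, level=1)
t1 = time.perf_counter()
restored = zlib.decompress(compressed)
t2 = time.perf_counter()

print(f"compress:   {(t1 - t0) * 1e3:.2f} ms")
print(f"decompress: {(t2 - t1) * 1e3:.2f} ms")
```

If the per-frame codec time measured this way is well under the frame period (33 ms at 30 fps), the decompression step is probably not your bottleneck and the shortcut is not worth the effort.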

You might also try to decouple compression from the driver by using an image_transport republish node.

Finally, using nodelets instead of nodes may eliminate some unnecessary data transmission.

At higher resolutions these are really huge amounts of data:

  • this means gigabits per second
  • e.g. 4K: 3840 × 2160 = 8,294,400 pixels × 4 bytes per pixel = 33,177,600 bytes per frame
  • 33,177,600 bytes × 30 frames per second = 995,328,000 bytes per second
  • 995,328,000 / 1024 / 1024 ≈ 949.2 MB per second
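The arithmetic above can be double-checked in a few lines of Python:

```python
# Reproducing the 4K bandwidth estimate (4 bytes per pixel, 30 fps).
pixels = 3840 * 2160                      # pixels per frame
frame_bytes = pixels * 4                  # bytes per frame
per_second = frame_bytes * 30             # bytes per second
mb_per_second = per_second / 1024 / 1024  # MB per second

print(f"{per_second} B/s ≈ {mb_per_second:.1f} MB/s ≈ {per_second * 8 / 1e9:.2f} Gbit/s")
```

That is nearly 8 Gbit/s for a single uncompressed 4K color stream, which indeed lands in the "gigabits per second" range.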

Higher resolutions in video processing are generally not suitable for uncompressed transmission/storage. They may also not be suitable for software processing at high framerates.

Unless you are careful, it is very easy to end up with a bottleneck in the workflow.

@marcomasa
Author

Hi @bmegli !

First of all, thank you so much for your detailed input!


> If you are lucky (Nvidia or Rockchip hardware) you may try using a hardware decoder.

Regarding the HW decoding flags: I unfortunately do not run the camera on a Jetson or Rockchip board, so I cannot use them. On the main system I run in a WSL2 environment (see also #48), so I might also lack additional drivers. However, I also tested on a native Linux machine and still saw very high CPU usage from the Linux driver. I might add system info and exact running specs later.


> ROS driver workflow
>
> If you try to record compressed data in ROS through image_transport (e.g. `compressed`), you end up with this workflow:
>
> • compressed data from sensor
> • decompressed by Orbbec SDK ROS (+ maybe color conversion)
> • compressed again by ROS image_transport

I am actually recording uncompressed data at the moment. And yes, you are right, the amounts of data are huge, but enabling compression brought the actual framerate down even further (somewhat to be expected).


> What might be done
>
> You might shortcut the ROS driver to publish the MJPEG data directly (this is like the compressed image_transport in ROS).
>
> • after checking with 2-3 lines of code how much time decompression/compression takes, it may not be worth the effort

Do you have those lines / timings available somewhere? :)


> Finally, using nodelets instead of nodes may eliminate some unnecessary data transmission.

I will try setting up launch files for that. I think Orbbec already provides the nodelets in general, but I have not seen any launch configurations for them in the repo.
