refactor(doTransform): 10-100x faster transform of pointcloud #432
base: noetic-devel
It seems like generating an Eigen::Vector3f and transforming it for each iteration is very inefficient. In our case, this mod took transform-time of a point-cloud from 90-100ms to 1-2ms.

Verified to give same result as old doTransform using:

```
sensor_msgs::PointCloud2ConstIterator<float> x_old(cloud1, "x");
sensor_msgs::PointCloud2ConstIterator<float> y_old(cloud1, "y");
sensor_msgs::PointCloud2ConstIterator<float> z_old(cloud1, "z");

sensor_msgs::PointCloud2ConstIterator<float> x_new(cloud2, "x");
sensor_msgs::PointCloud2ConstIterator<float> y_new(cloud2, "y");
sensor_msgs::PointCloud2ConstIterator<float> z_new(cloud2, "z");

std::vector<double> compare_vector;
for (; x_new != x_new.end(); ++x_new, ++y_new, ++z_new, ++x_old, ++y_old, ++z_old)
{
  compare_vector.push_back(*x_new - *x_old);
  compare_vector.push_back(*y_new - *y_old);
  compare_vector.push_back(*z_new - *z_old);
}

double max_diff = *max_element(compare_vector.begin(), compare_vector.end());
double min_diff = *min_element(compare_vector.begin(), compare_vector.end());
ROS_INFO("Biggest differences: %f, %f", max_diff, min_diff);
```

Not sure how/where to put this test-code in a proper test, so I'll leave it here until I get some feedback.
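To illustrate the idea described above (avoiding a temporary vector object per point), here is a minimal, dependency-free sketch. It is not the code committed in this PR: the `Affine3x4` alias, the `transformPointsInPlace` helper, and the separate `xs`/`ys`/`zs` arrays are hypothetical stand-ins for the transform and the PointCloud2 field iterators. The assumption is that the rigid transform is flattened once into twelve coefficients, so the loop body only needs scalar multiply-adds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the rigid transform: a row-major 3x4 matrix
// [R | t], i.e. the top three rows of a homogeneous transform.
using Affine3x4 = std::array<float, 12>;

// Hypothetical helper (not the PR's actual code): transforms all points in
// place, using only scalar multiply-adds inside the loop.
inline void transformPointsInPlace(const Affine3x4& m,
                                   std::vector<float>& xs,
                                   std::vector<float>& ys,
                                   std::vector<float>& zs)
{
  // All three coordinate arrays must describe the same number of points.
  assert(xs.size() == ys.size() && ys.size() == zs.size());

  for (std::size_t i = 0; i < xs.size(); ++i) {
    // Read the inputs first so the in-place writes do not clobber them.
    const float x = xs[i];
    const float y = ys[i];
    const float z = zs[i];

    xs[i] = m[0] * x + m[1] * y + m[2] * z + m[3];
    ys[i] = m[4] * x + m[5] * y + m[6] * z + m[7];
    zs[i] = m[8] * x + m[9] * y + m[10] * z + m[11];
  }
}
```

In a real PointCloud2 the coordinates are interleaved in one byte buffer, so an actual implementation would walk the field iterators rather than three separate vectors; the arithmetic per point would stay the same.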
Hi :) Thank you for making this PR. The main problem of the original code is that It may be possible to speed up the original code by initializing the
Using |
Hi
It seems the "point = t * point_in" line is the bottleneck, clocking in at ~5000 ns, with the other lines inside the loop at around 100-200 ns each. In comparison, each of the three lines in the loop in my committed code takes around 50-100 ns. (Of course, individual time measurements should probably be taken with a grain of salt when the durations are this small, but the orders of magnitude should be reliable.) I'm not familiar enough with parallel computing to say anything about how it might affect the performance of the different approaches, so I'll leave that discussion to someone who, unlike me, knows what they're talking about.
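Because single measurements at the 50-5000 ns scale are noisy, one common way to get steadier numbers is to time many repetitions and divide by the count. The sketch below shows that approach; it is not how the figures in this thread were obtained, and `averageNanosPerCall` is a hypothetical helper name.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: returns the average wall-clock time per call of `fn`
// in nanoseconds, measured over `iterations` back-to-back calls.
template <typename Fn>
double averageNanosPerCall(Fn&& fn, std::size_t iterations)
{
  if (iterations == 0) {
    return 0.0;  // Nothing to measure.
  }

  // steady_clock is monotonic, so the difference cannot go negative.
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    fn();
  }
  const auto stop = std::chrono::steady_clock::now();

  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}
```

When using something like this, the work inside `fn` needs an observable side effect (for example, writing into the output cloud), otherwise an optimizing compiler may remove it and the reported time will be meaningless.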
I'm very sorry for my late response. Thank you for your benchmark.
Signed-off-by: Shivam Pandey <[email protected]>