Parallelize the Packer #11
Excellent! Give me a day or so to go over things and do a few checks, but at a cursory glance this looks good. Thanks for your contribution! (#9)
Codecov Report
@@            Coverage Diff            @@
##           master     #11     +/-   ##
=========================================
+ Coverage   93.23%   93.45%   +0.22%
=========================================
  Files           5        5
  Lines         266      275       +9
=========================================
+ Hits          248      257       +9
  Misses         18       18
I ran into this problem when I did some quick rayon testing myself - doesn't look too hot with the current implementation... I've taken the current master:
parallel branch:
I'm using a 14-core 2690v4, so I expect some overhead cost - especially on a 15-second single-core run - but at the same time I thought rayon would choose whether or not to parallelise over multiple workers, depending on the situation.
The same with sphere sizes of 0.1 to 0.3 master:
parallel:
So it's not a spool up/down issue, and I can test on the quicker, 15 second benchmark...
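The sequential-versus-parallel decision mentioned above can also be made explicit instead of being left to the scheduler. The following is only a sketch of that idea: it does not use rayon (std scoped threads stand in so the example has no dependencies), and `PARALLEL_THRESHOLD` and `any_adaptive` are hypothetical names with an arbitrary cutoff that would need tuning against real benchmarks.

```rust
use std::thread;

/// Hypothetical cutoff: below this many items we assume threads do not pay off.
const PARALLEL_THRESHOLD: usize = 10_000;

/// Returns true if `pred` holds for any item. The slice is split across
/// `workers` scoped threads only when it is large enough to justify it.
fn any_adaptive<T, F>(items: &[T], workers: usize, pred: F) -> bool
where
    T: Sync,
    F: Fn(&T) -> bool + Sync,
{
    if items.len() < PARALLEL_THRESHOLD || workers < 2 {
        // Small input: a plain sequential scan avoids all threading overhead.
        return items.iter().any(|x| pred(x));
    }

    // Ceiling division so every item lands in exactly one chunk.
    let chunk_len = (items.len() + workers - 1) / workers;
    let pred = &pred;

    thread::scope(|scope| {
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().any(|x| pred(x))))
            .collect();
        // Handles not joined here are joined automatically when the scope ends.
        handles.into_iter().any(|h| h.join().unwrap())
    })
}
```

Calling `any_adaptive(&set, 4, |s| ...)` on a short slice never spawns a thread, which is the behaviour one would want for the inner overlap check when the set is small.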
src/lib.rs
Outdated
@@ -432,14 +442,16 @@ fn identify_f<C: Container>(
)?;

// Make sure the spheres are bounded by the containing geometry and do not overlap any spheres in V
if container.contains(&s_4_positive) && !set_v.iter().any(|v| v.overlaps(&s_4_positive)) {
f.push(s_4_positive);
if container.contains(&s_4_positive) && !set_v.par_iter().any(|v| v.overlaps(&s_4_positive))
Dropping this `par_iter` and the one directly below gives me 18 seconds on the first benchmark and 2 mins 47 secs on the second. So not that much improvement yet...
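For reference, the sequential form of the overlap check being discussed can be sketched as below. `Sphere` here is a simplified stand-in (plain arrays instead of nalgebra points), the overlap rule is an assumption, and `is_free` is a hypothetical helper name; none of this is taken from the crate itself.

```rust
/// Simplified stand-in for the crate's sphere type (assumption: the real one
/// stores an nalgebra point; a plain array keeps this sketch dependency-free).
#[derive(Clone, Debug, PartialEq)]
struct Sphere {
    center: [f64; 3],
    radius: f64,
}

impl Sphere {
    /// Assumed overlap rule: centres closer than the sum of the radii.
    /// Compares squared distances to avoid a square root per check.
    fn overlaps(&self, other: &Sphere) -> bool {
        let d2: f64 = self
            .center
            .iter()
            .zip(other.center.iter())
            .map(|(a, b)| (a - b) * (a - b))
            .sum();
        let reach = self.radius + other.radius;
        d2 < reach * reach
    }
}

/// Sequential check: true when `candidate` overlaps nothing in `set_v`.
/// `any` short-circuits on the first overlap it finds.
fn is_free(candidate: &Sphere, set_v: &[Sphere]) -> bool {
    !set_v.iter().any(|v| v.overlaps(candidate))
}
```

Each individual overlap test is a handful of floating-point operations, so per-item work is small compared with the cost of handing items to other threads.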
By dropping the two This is certainly not my strong suit, so if you have other suggestions or ideas I'm very happy to consider them.
&& nalgebra::distance(&curr_sphere.center, &s_dash.center)
<= curr_sphere.radius + s_dash.radius + 2. * new_radius
})
.cloned(),
Well that totally makes sense. :) I'm out of town today, so will run those quick benches tomorrow when I get home and see how they compare.
So I'm not getting any timing difference with this change at all. Do you have any benches of your own that I can take a look at? Perhaps it's only my machine?
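To compare timings across machines on equal terms, a tiny std-only harness along these lines could be shared. It is a rough sketch, not a substitute for a proper benchmarking tool; `time_it` and `best_of` are hypothetical helper names.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the wall-clock time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    // black_box discourages the optimiser from discarding the computation.
    let out = black_box(f());
    (out, start.elapsed())
}

/// Runs `f` `runs` times and returns the fastest run, or None if `runs` is 0.
/// Taking the minimum reduces noise from other processes on the machine.
fn best_of<T>(runs: usize, mut f: impl FnMut() -> T) -> Option<Duration> {
    (0..runs).map(|_| time_it(|| f()).1).min()
}
```

Wrapping the packing call in `best_of(5, || ...)` on both the master and parallel branches would give numbers that are easier to compare than single runs.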
Also get rid of a few allocations.
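One common way to trim allocations in a neighbour filter like the one in the diff above is to yield borrowed spheres instead of cloning them. The sketch below illustrates that general technique only; it is not necessarily what this commit did. `Sphere`, `distance`, and `neighbours` are simplified, hypothetical stand-ins, and the self-exclusion test is an assumption, since the first half of the original condition is not shown.

```rust
/// Simplified stand-in for the crate's sphere type (plain array, no nalgebra).
struct Sphere {
    center: [f64; 3],
    radius: f64,
}

/// Euclidean distance between two points.
fn distance(a: &[f64; 3], b: &[f64; 3]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f64>()
        .sqrt()
}

/// Lazily yields references to spheres close enough to `curr` that a new
/// sphere of radius `new_radius` could touch both. Nothing is cloned and no
/// intermediate Vec is allocated.
fn neighbours<'a>(
    curr: &'a Sphere,
    set_v: &'a [Sphere],
    new_radius: f64,
) -> impl Iterator<Item = &'a Sphere> + 'a {
    set_v.iter().filter(move |s_dash| {
        // Assumption: the elided first condition excludes `curr` itself.
        !std::ptr::eq(*s_dash, curr)
            && distance(&curr.center, &s_dash.center)
                <= curr.radius + s_dash.radius + 2.0 * new_radius
    })
}
```

Callers that really need owned values can still add `.cloned()` at the call site (after deriving `Clone`), which keeps the cost visible where it is paid.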