Parallelisation support for analyse #435
Conversation
That's a pity that …
Added some additional changes. The main one here is that I changed analyse to transfer data into the sub-processes via disk instead of via the network. I must admit I am a little unsure about this, as I don't know how reliable parallel file reads are, though it seems to work fine in my testing so far… I also added a …
Co-authored-by: Isaac Gravestock <[email protected]> Signed-off-by: Craig Gower-Page <[email protected]>
@gravesti - It's just dawned on me that using RDS files to load data into the parallel processes means we are limiting parallelisation to local machines only. I think (at least from the documentation) that PSOCK supports running parallel processes on remote machines, which wouldn't work in this case. The RDS loading saves roughly 3-4 seconds, so I'm not sure whether to switch back to network loading, just put a big warning in the documentation pages, or provide a toggleable option.
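To make the trade-off concrete, here is a minimal sketch of the two transfer strategies using base R's `parallel` package. This is illustrative only, not rbmi's actual internals; `big_object` and `path` are hypothetical names:

```r
library(parallel)

cl <- makePSOCKcluster(2)
big_object <- matrix(rnorm(1e6), ncol = 100)

# Option A: network transfer. Works even for remote worker nodes, but
# serialises the object over the socket connection, which can be slow.
clusterExport(cl, "big_object")

# Option B: disk transfer. Each worker reads the object from a file.
# Faster locally, but assumes all workers can see the same filesystem,
# which breaks for remote PSOCK nodes.
path <- tempfile(fileext = ".rds")
saveRDS(big_object, path)
clusterExport(cl, "path")
clusterEvalQ(cl, big_object <- readRDS(path))

stopCluster(cl)
```

Option B is what this PR currently does, hence the restriction to a single machine.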
@gowerc Have we been testing on a realistic analysis size? If so, then maybe it is OK to be limited to a single machine and just have a warning in the docs. I guess the same argument applies to not using RDS at all: accept the few seconds' penalty in exchange for less complexity, no restrictions, and no extra arguments.
Talking to Marcel and Alessandro, I think a realistic sample size would be in the order of 1000-2000, e.g. our test code is likely in the right region (20-30 seconds when run sequentially). Though I think a common use case for this will be in tipping point analyses, where the same call to `analyse()` is repeated many times.

My current assumption would be that people who are using rbmi's internal parallelisation are likely just speeding up individual runs on their local machines. If a user is getting to the stage where they need to run across different nodes (say for multiple sensitivity analyses on different sub-endpoints), they are more likely not to be using rbmi's internal parallelisation at all and instead be parallelising the different runs of rbmi externally.

Given how inefficient the internal parallelisation of rbmi is for small jobs, remote clusters may not be that useful here anyway. Then again, I fear people will just assume this functionality is fully compatible with parallel's remote PSOCK setup and run it anyway, i.e. we are breaking that interface...

I think on balance I'm inclined to leave it as is and add a warning note in the documentation.
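For reference, "parallelising externally to rbmi" could look something like the following sketch, distributing tipping point deltas across workers with `parLapply`. Everything here is a hypothetical placeholder (`run_analysis` is not an rbmi function), just to show the shape of the approach:

```r
library(parallel)

deltas <- seq(0, 5, by = 0.5)

# Hypothetical wrapper: runs one full (sequential) rbmi analysis for a
# given delta adjustment and returns the estimate of interest. The real
# body would call rbmi's analyse/pool functions.
run_analysis <- function(delta) {
  delta  # placeholder return value
}

# Each worker handles a different delta, so each rbmi run stays
# sequential and no data needs to be shared mid-analysis.
cl <- makePSOCKcluster(4)
results <- parLapply(cl, deltas, run_analysis)
stopCluster(cl)
```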
@gowerc Will you merge the lintr branch first? Do you still need to update the docs in this PR? |
@gravesti - Have merged in the lintr PR and have added an additional warning regarding the PSOCK network limitation.
Thanks for all the effort on this one @gowerc !
Urgh, just seen there's a bunch of NOTEs that I need to resolve:
Ok, I think that last commit should address the NOTE.
Closes #370

Some general notes:

- I didn't end up using `future()` in the end. I just ran into edge case after edge case that made it too difficult to use and test.
- Added `make_rbmi_cluster()` to make the cluster setup process as smooth as possible.
- For `lm`-based analyses, currently I am only seeing a 30% gain, and only when using large samples (e.g. >2000); below that it often takes longer to run in parallel. This is mostly due to the IO exchange by the looks of it. R's PSOCK clusters just transfer data exceedingly slowly :(
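A rough usage sketch of how the helper might be called. The argument names and the `impute_obj` input are assumptions based on this thread, not checked against the final API, so consult the package documentation for the actual signatures:

```r
library(rbmi)

# Spin up a local cluster via the new helper, pass it to analyse(),
# then shut it down when finished.
cl <- make_rbmi_cluster(2)
ana_obj <- analyse(impute_obj, fun = ancova, ncores = cl)
parallel::stopCluster(cl)
```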