-
-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include stan_surv
in CRAN release of rstanarm?
#570
Comments
Yes! I've been wanting to release this on CRAN for ages but I think there was some compilation related issue that was preventing it (@bgoodri would know much better than I do). Ben, what's the current situation? Are we able to include this is a release soon? |
If I remember correctly, the issue was that the produced assembly was too big for 32-bit Windows, which meant that CRAN could not build binaries for 32-bit Windows. However, CRAN has since dropped the requirement to support 32-bit Windows once R 4.1 was released. So we might be able to get this on CRAN now. I am planning to update the rstanarm binaries on mc-stan.org/r-packages this weekend for R4.1 and R4.2. Hopefully will be able to also make a Github Actions workflow so that this becomes less manual and can be triggered by anyone with permissions. |
The |
If it is too difficult to make this work inside |
@rok-cesnovar Yeah sure is. I managed to merge in master and solve the merge conflicts. I can't build from source though so wasn't able to test things work, so comes with that disclaimer. But otherwise feel free to work from that branch. 👍 |
Yeah I've pondered on this many times in the past. But I think it might be a bit hard to maintain as an entirely separate package, including the refactoring to work out how to avoid having to duplicate all the external postestimation and/or internal util functions in both packages! |
The 32bit thing was an issue, although we had always been able to barely get around it with rstanarm. The bigger deal for CRAN was the compilation time and RAM especially on Windows, but that seems to have been lessened by the additions to src/Makevars.win that utilize LTO. It does not take so long to compile any one model into the intermediate representation and both that and the linking / optimization can be done in parallel. And it doesn't seem to have broken anything. So, I think we can start to add Stan programs, starting with the survival and CAR models that have been sitting around for ages. |
Wonderful! What is the minimum of steps, that remains to be done? Can we agree on a minimal number and set of tests? I could help to choose some, by comparison with reference models (e.g., via coxph in the survival R package, conjugate distributions were possible with analytical posterior, to name just a few). Also, we could look at the growing number of publications, citing Lastly, testing should focus on parts that we think have a high likelihood of containing bugs. So we could start defining tests, based on known bugs the . Another incentive to fix those bugs, soon! PS: sorry that “my” PhD student didn’t work on the topic in the end. |
I think “the big elephant in the room” is that with @sambrilleman, the original author of the function, not being able to (pro-)actively work on it, we need someone that familiarises him-/herself with the code, in order to contribute and move stan_surv finally into stable rstanarm. And that requires time, focus, supervision and efforts, and thus needs a strong incentive. Which isn’t apparent to me. I like the proposal to have a phd student work on it, but my example showed that the incentives for her to work on stan_surv were too low, given other tasks and their respective incentives. What are the incentives? Is there an opportunity to joinedly sponsor a student? |
My impression was that the unit testing was not as extensive as for the rest of rstanarm. It would be good if SBC has been used to make sure that the Stan programs are working as intended, but that isn't something that we have been including in rstanarm releases. |
Perhaps this would be a good place to start, by adding unit tests and/or addressing some of the lingering bugs. This would leave the SBC-based testing (beyond maybe a simple test) for an external library or project to tackle.. |
I think lack of clarity around this is one of the biggest challenges to moving this forward. @bgoodri , can you elaborate here? |
I can echo @jburos remark about the need for more clarity on requirements. I've discussed the topic also with a member from the openstatsware working group, a scientific working group of the American Statistical Association (ASA) Biopharmaceutical section (BIOP). See here. There is the possibility to propose the topic there, essentially requesting support and resources from them. However, for this, we would need a clearer definition of the work package to be done, particularly acceptance criteria for addition to stable rstanarm. |
So to provide a couple of updates, given that @andrjohns might have some time to try and look at getting this over the line! 🥳 🙏 💪 About 4 months ago I spent a day or so trying to get it ready to be merged. I got the branch up to date with I got stuck on a couple of tests, I can't remember exactly but I think they were to do with frailty models or maybe AFT models. I've just triggered a new Actions build on #323 to see, since the Actions logs from the previous build have been removed (they are only stored for a finite period of time), so we can inspect the logs and find more info after the build finishes. The tests that failed are not deterministic - they simulate some data, then try to assert estimates are within some tolerance of the known true value. I can't recall if setting the seed ensures reproducibility, I'd hope it does. But in any case, a realistic tolerance is always going to be sensitive to the sample size of the simulated data. But increasing sample sizes mean the tests can take an uncomfortably long time to run, for things like frailty models (i.e. random effects). There are also some blocks with So yeah, there might be a couple of remaining tests to get green. And then it would be great to set up a way to run the SBC testing in a feasible time frame - e.g. being able to deploy it to run on parallel processes or machines on any available infra that Stan dev or public have available. The slow fitting models are probably the biggest pain point when trying to write tests for this package. And then maybe looking at a couple of the existing recorded bugs in the Issues, if they are deemed worthy of fixing before an initial release. It feels so close except for those couple of red tests. |
Summary:
It would be amazing if we could finally merge the
feature/survival
branch into rstanarm, per this discussion on Apr 29. I'm surfacing this as a stand-alone issue to aid discussion around this.I would hate to see this die on the vine, and not make it into wider use. What needs to happen to get this over the finish line?
Description:
The goals are (at a minimum):
stan_surv
and related functions.Some options for release & installation:
feature/survival
into master, release on CRAN (and elsewhere)feature/survival
branch, hosted on mc-stan.org. [is this already being done?]Reproducible Steps:
See related issues: #500 , #566 , etc.
RStanARM Version:
2.21.3
R Version:
N/A
Operating System:
N/A
The text was updated successfully, but these errors were encountered: