csSampling with multilevel models? #5
Thanks for your patience. Could you provide more information about the model and the data dimensions? We've used this to fit simple random-intercept models before. Did you use the brms wrapper or a custom Stan model? If the Stan model fit, then the issue is in the post-processing. There are some data type conversions that could be inefficient for large numbers of samples/draws. I'll start looking into it, so any specifics you can provide would greatly help!
I think the crux of the issue is here: https://discourse.mc-stan.org/t/as-matrix-for-unconstrained-parameters/11528/2
Thanks for looking into the issue. I created a minimal example of what I'm trying to do. The code below loads a 2-year MEPS longitudinal data file in wide-format that is converted to long. The goal of the analysis is to determine the change in
So I was wrong. I tried using the posterior package (https://mc-stan.org/posterior/articles/posterior.html) but didn't see any efficiency gains. A potentially worse problem was the lazy use of rbind instead of pre-allocating a matrix for the parameters. The cs_sampling version in the testing branch should run faster now that the rbind calls have been replaced: https://github.com/RyanHornby/csSampling/tree/testing
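For readers unfamiliar with why growing a matrix row by row is slow, here is a minimal sketch of the pattern in plain Python. It is only an analogy, not the csSampling code (which is R): the function names `stack_by_concatenation` and `stack_preallocated` are made up for this illustration. Repeatedly building a new container copies everything stored so far on each iteration, whereas pre-allocating writes each draw exactly once.

```python
def stack_by_concatenation(draws):
    """Grow the result one row at a time (analogous to rbind in a loop).

    Each `rows + [...]` builds a brand-new list, so all earlier rows are
    copied again on every iteration: quadratic work in the number of draws.
    """
    rows = []
    for d in draws:
        rows = rows + [list(d)]
    return rows


def stack_preallocated(draws, n_params):
    """Allocate the full draws-by-parameters table once, then fill it in place."""
    out = [[0.0] * n_params for _ in range(len(draws))]
    for s, d in enumerate(draws):
        for k in range(n_params):
            out[s][k] = d[k]
    return out
```

Both functions return the same table of values; only the amount of copying differs, which starts to matter when there are thousands of draws and thousands of parameters.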
@awcm0n I apologize for the seriously long delay. Using the example code you provided and the testing branch with the default settings, it ran for me in about 8 hours. The Stan part finished in a few minutes. The issue is that there are 5K+ random effects estimated. So even though the model has only 4 global parameters, cs_sampling is going to estimate and adjust all of the parameters. The bottleneck is that the default adjustment goes through each MCMC draw, estimates the Hessian, and then averages the results. This is a big 5K by 5K matrix of derivatives. The alternative is to evaluate the Hessian just once, at the posterior mean. The two should be equivalent for large sample sizes, but the MCMC average is more stable. Changing the default let this run in about 25 minutes for me. There are now status messages, and the slowest part is step (4), where we have to invert the H and J matrices and take their eigendecomposition. That's probably 80-90% of the time now. Here's the updated call. Note the use of the H_estimate argument. The default is "MCMC"; anything else will use the posterior mean.
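To make the difference between the two Hessian estimates concrete, here is a small toy sketch in plain Python. It is not the csSampling implementation and not the updated call referred to above: the log-density, the finite-difference helper, and all function names are assumptions made up for this illustration. It only shows why averaging over draws costs one Hessian evaluation per draw, while the posterior-mean version costs a single evaluation.

```python
def log_density(t):
    """Toy, non-quadratic log-density in two parameters (illustrative only)."""
    return -0.5 * (t[0] ** 2 + 2.0 * t[1] ** 2) - 0.1 * t[0] ** 4


def hessian_fd(f, theta, eps=1e-3):
    """Central finite-difference Hessian of a scalar function f at theta."""
    p = len(theta)
    H = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di
                t[j] += dj
                return f(t)
            H[i][j] = (shifted(eps, eps) - shifted(eps, -eps)
                       - shifted(-eps, eps) + shifted(-eps, -eps)) / (4.0 * eps * eps)
    return H


def hessian_mcmc_average(f, draws):
    """Evaluate the Hessian at every draw and average: one Hessian per draw."""
    p = len(draws[0])
    total = [[0.0] * p for _ in range(p)]
    for d in draws:
        H = hessian_fd(f, d)
        for i in range(p):
            for j in range(p):
                total[i][j] += H[i][j] / len(draws)
    return total


def hessian_at_mean(f, draws):
    """Evaluate the Hessian once, at the mean of the draws."""
    p = len(draws[0])
    mean = [sum(d[k] for d in draws) / len(draws) for k in range(p)]
    return hessian_fd(f, mean)
```

With thousands of draws and a 5K by 5K Hessian, the first function repeats the expensive step for every draw, which is consistent with the runtime gap described above. On a non-quadratic log-density and a small set of draws the two estimates generally differ; they coincide exactly when the log-density is quadratic.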
I'm interested in using the csSampling package to run multilevel models on complex survey data, but I didn't succeed in fitting a simple random-intercept model. After the Stan model was fit, the process stalled without an error message. So my question is: is there any guidance on which types of models can and cannot be fit with the csSampling package?