
describe_posterior() triggers model recompiling & sampling #997

Open
mattansb opened this issue Jan 27, 2025 · 13 comments

Comments

@mattansb
Member

Discussed in easystats/bayestestR#693

Originally posted by JWiley January 27, 2025
I am experiencing some behaviour that I think is unexpected, but perhaps I am missing something.
I fit some models in brms and save them. Later I read them back in and call describe_posterior(), mostly because I want to report 99% CIs, which are not computed by default in brms.

This is the code:

out$Summary <- bayestestR::describe_posterior(
  out$Results,  # a previously fitted and saved brmsfit object
  centrality = c("mean", "median"), dispersion = TRUE,
  ci = CI, ci_method = CIType,  # here CI = 0.99 and CIType = "HDI"
  test = c("p_direction"),
  diagnostic = c("Rhat", "ESS"),
  effects = "all", component = "all",
  parameters = "(^b_)|(^sd_)|(^cor_)|(^sds_)|(^bs_)")

The part that is unexpected to me is that this call seems to trigger recompiling and resampling the model, which really slows everything down.

The desired updates require recompiling the model
Compiling Stan program...
Start sampling
Running MCMC with 4 parallel chains...

Chain 1 Iteration:    1 / 3000 [  0%]  (Warmup) 
Chain 2 Iteration:    1 / 3000 [  0%]  (Warmup) 
Chain 3 Iteration:    1 / 3000 [  0%]  (Warmup) 
Chain 4 Iteration:    1 / 3000 [  0%]  (Warmup) 
[snip]
Chain 1 Iteration: 2800 / 3000 [ 93%]  (Sampling) 
Chain 4 finished in 686.1 seconds.
Chain 3 Iteration: 3000 / 3000 [100%]  (Sampling) 
Chain 3 finished in 704.9 seconds.
Chain 2 Iteration: 3000 / 3000 [100%]  (Sampling) 
Chain 2 finished in 706.4 seconds.
Chain 1 Iteration: 2900 / 3000 [ 96%]  (Sampling) 
Chain 1 Iteration: 3000 / 3000 [100%]  (Sampling) 
Chain 1 finished in 727.6 seconds.

All 4 chains finished successfully.
Mean chain execution time: 706.3 seconds.
Total execution time: 727.8 seconds.

It's fairly easy to sidestep this, since all I want beyond what's already saved in the model results is the 99% CIs and perhaps some of the probability-of-direction values; I just expected this call to describe_posterior() to be virtually "free" computationally.
As far as I know, I'm not trying to calculate Bayes factors or anything else that would need prior draws, which is the only thing I can think of that isn't already drawn and saved in the model.

Any thoughts or insight would be welcome.


I'm not entirely sure what in the model is needed to trigger this:

  • if the outcome distribution is gaussian, the recompile and resampling are not needed
  • if there is no random slope, even with a negative binomial distribution, the recompile and resampling are not needed

So my best guess is that it has something to do with a random slope combined with the negative binomial distribution; it may also have to do with correlations among the random effects.

With most models, this behaviour is not exhibited, which I think is why it seemed unexpected to me; then again, it may be expected and I simply don't understand what is happening, or what needs to happen, for the posterior summaries to be calculated.


Reproducible example:

## analysis related
library(brms)
library(cmdstanr)
library(bayestestR)

CIType <- "HDI"
CI <- 0.99

m <- brm(
  formula = bf(
    hp ~ mpg + (1 | am / cyl) + (1 + mpg | cyl),
    shape ~ 1 + (1 | cyl)
  ), data = mtcars,
  family = negbinomial(),
  save_pars = save_pars(group = TRUE, latent = FALSE, all = TRUE),
  init = "random",
  chains = 4, iter = 3000, warmup = 1000, thin = 1,
  cores = 4,
  normalize = TRUE, algorithm = "sampling",
  seed = 1234, backend = "cmdstanr"
)

bayestestR::describe_posterior(
  m,
  centrality = c("mean", "median"), dispersion = TRUE,
  ci = 0.99, ci_method = "HDI",
  test = c("p_direction"),
  diagnostic = c("Rhat", "ESS"),
  effects = "all", component = "all",
  parameters = "(^b_)|(^sd_)|(^cor_)|(^sds_)|(^bs_)"
)
@mattansb
Member Author

This seems to happen when describe_posterior() calls rope_range(), which calls insight::get_sigma() only when the model has a count distribution:

https://github.com/easystats/bayestestR/blob/fb746e32b652794df90554865374ef0935c5d2ff/R/rope_range.R#L166-L171
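
For context, a paraphrased sketch of that branch (not the verbatim source; faminfo stands in for the result of insight::model_info()):

faminfo <- insight::model_info(model)
if (faminfo$is_count) {
  # extracting sigma here is what ends up triggering the null-model refit
  sig <- insight::get_sigma(model, verbose = FALSE)
  rope_limits <- c(-0.1, 0.1) * sig
}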


Reprex:

library(brms)

m <- brm(
  formula = bf(
    hp ~ mpg + (1 | am / cyl) + (1 + mpg | cyl),
    shape ~ 1 + (1 | cyl)
  ), data = mtcars,
  family = negbinomial(),
  save_pars = save_pars(group = TRUE, latent = FALSE, all = TRUE),
  init = "random",
  chains = 2, iter = 1000, warmup = 500, thin = 1,
  cores = 2,
  normalize = TRUE, algorithm = "sampling",
  seed = 1234
)

insight::get_sigma(m)
#> The desired updates require recompiling the model
#> Compiling Stan program...

@mattansb
Member Author

This is happening in insight::get_variance_residual() / insight::get_variance(), because it refits a completely new model with insight::null_model() over here (line 86):

if (faminfo$is_linear && !faminfo$is_tweedie) {
  # we don't need these for linear models
  me_info_null <- NULL
} else {
  if (is.null(model_null)) {
    model_null <- .safe(null_model(model, verbose = FALSE))
  }
  me_info_null <- .get_variance_information(
    model_null,
    faminfo = faminfo,
    name_fun = name_fun,
    verbose = verbose
  )
}

Refitting a model, especially an MCMC one, can be very costly. This should almost never happen unless the user explicitly requests it.

mattansb transferred this issue from easystats/bayestestR Jan 27, 2025
@strengejacke
Member

What would you suggest, returning NULL?

@mattansb
Member Author

I don't know enough about what's going on here; I just tracked this down. What is this function trying to do that it (sometimes?) needs an empty model?

@JWiley

JWiley commented Jan 27, 2025

@mattansb why is rope_range() even getting called when test = c("p_direction")? I would have thought that would suppress anything to do with ROPEs. Alternatively, for the short term, do you know of any way I can suppress the rope_range() call?

@strengejacke
Member

I don't know enough about what's going on here; I just tracked this down. What is this function trying to do that it (sometimes?) needs an empty model?

We define the ROPE range as 0.1 * sigma for some distributions. If we can't extract sigma via stats::sigma() and we have a mixed model, we call insight::get_variance(), which compiles the null model.

If sigma is not available, we fall back to 1. We could do this without calling get_variance() first.
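
A minimal sketch of that shortcut (illustrative only, not the actual patch):

# fall back to 1 as soon as stats::sigma() fails, instead of going through
# insight::get_variance() and compiling a null model
s <- tryCatch(stats::sigma(model), error = function(e) NULL)
if (is.null(s) || length(s) == 0 || anyNA(s)) {
  s <- 1
}
rope_limits <- c(-0.1, 0.1) * s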

@strengejacke
Member

For that model:

sigma(m)
#> numeric(0)

@mattansb
Member Author

@JWiley it shouldn't. See potential fix in easystats/bayestestR#695

@mattansb
Member Author

@strengejacke why does insight::get_variance() compile the null-model?

@JWiley

JWiley commented Jan 27, 2025

I'm not particularly up-to-date in this space, but the last things I read about estimating variances for GLMMs, used for effect sizes, R2, and the like, defined R2 as something like:

$$ R^2 = \frac{\mathrm{var}_{fixed}}{\mathrm{var}_{fixed} + \mathrm{var}_{random} + \text{some term involving}~\lambda} $$

I don't recall the exact equation for the term with lambda; some ratios, I think. Anyway,

$$ E(\lambda) = \exp(\beta_0 + \mathrm{var}_{random}) $$

with $\beta_0$ coming from the null model. Otherwise, I think you can maybe do some things if all predictors are centered, etc., but I guess that gets harder to know and guarantee; I seem to recall there was some issue with ensuring the estimate changes based on the data rather than the model, so the recommendation was to base it on the intercept-only model.

There are probably some errors in what I've just said, but the gist I recall is that when trying to get effect sizes like $R^2$ for negative binomial and Poisson models, an intercept-only model was needed somewhere in the calculation.

My guess is that since $R^2$ relates to various variance estimates and ratios, the same applies when trying to get some variance estimate from a negative binomial model, and that's why insight is estimating a null model.

On the plus side, that explains why it only took 10 minutes, not 6 hours like my full model, when it did recompile and run. I was very confused because I thought it was running the same model, and as slow as it was, it was still way faster than my models.

If I'm on track with that, some thoughts from a user perspective that would be nice:

  1. in describe_posterior(), it should be possible to turn ROPE estimation off entirely, to avoid this whole path (a manual workaround along these lines is sketched below)
  2. in insight::get_variance(), if it is going to automatically compile and sample from a null model, print a short note, at least for MCMC models, something like: "getting the variance for this model requires estimating an intercept-only model"
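
In the meantime, a possible workaround that skips the ROPE path entirely is to call the individual building blocks instead of describe_posterior() (a sketch, untested against this exact model, but none of these functions go through rope_range()):

# assemble the summary pieces by hand instead of via describe_posterior()
bayestestR::point_estimate(m, centrality = c("mean", "median"), dispersion = TRUE)
bayestestR::hdi(m, ci = 0.99)
bayestestR::p_direction(m)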

@strengejacke
Member

The easy solution is to add no_recursion = TRUE to the get_sigma() call; this will not trigger calling get_variance(). That argument is not documented; it's only used internally for now.
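
For illustration, the patched call would look something like this (no_recursion is an internal, undocumented argument, so treat this as a sketch that may change):

# skips the get_variance() fallback, so no null model is compiled or sampled
insight::get_sigma(m, no_recursion = TRUE)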

And @JWiley is right with his explanation of why we need the null model. It's required in the sub-function .variance_distributional().

@strengejacke
Member

Can we close this, since it's fixed in bayestestR?

@mattansb
Member Author

Let's keep this open; we should think of a better option than refitting a null model, especially a Bayesian null model, since "recycling" the priors from the full model to the null model might not make any sense.
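
For reference, for a brmsfit the null-model refit amounts to something like the following (a rough sketch of the idea, not the exact insight::null_model() internals, which may retain the random-effects part):

# an intercept-only refit: this recompiles the Stan program and resamples,
# inheriting priors from the full model where they still apply, which may
# not be sensible for the reduced model
m0 <- update(m, formula. = hp ~ 1, recompile = TRUE)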
