Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mixed vs fixed model bootstrap does not use the same type #505

Open
mattansb opened this issue May 27, 2021 · 25 comments
Open

mixed vs fixed model bootstrap does not use the same type #505

mattansb opened this issue May 27, 2021 · 25 comments
Labels
Consistency 🍏 🍎 Expected output across functions could be more similar

Comments

@mattansb
Copy link
Member

In bootstrap_model:

  • lme4::bootMer() defaults to type = "parametric".
  • boot::boot() defaults to sim = "ordinary" which is non-parametric.

Should we be consistent? Or does the diff between the models somehow justify this?

@bwiernik thoughts?

@mattansb mattansb added the Consistency 🍏 🍎 Expected output across functions could be more similar label May 27, 2021
@strengejacke
Copy link
Member

As fa as I know, there is no real "non-parametric" bootstrapping for merMod, but just "semi-parametric". Not sure if this will bring us closer to consistency, or if it is preferred to stick to the defaults?

@mattansb
Copy link
Member Author

In the docs the options at parametric of semiparametric - there is no non-parametric.

https://www.rdocumentation.org/packages/lme4/versions/1.1-27/topics/bootMer

@bwiernik
Copy link
Contributor

bwiernik commented May 27, 2021

Yeah, parametric is really the only stable option for bootMer. Semiparametric is pretty unstable; it gives weird results fairly often in my experience and is applicable to only a limited set of models. Resampling cases for mixed effects models is tricky (basically has to happen at the cluster level) and is pretty intractable for crossed models.

There is this package, but I've not investigated it https://cran.r-project.org/web/packages/lmeresampler/lmeresampler.pdf

Do we accept and pass along the use.u option to bootMer? If not, we should.

I'd also like to support other bootstrap intervals besides the percentile (eg, BCa). Opened an issue for this earlier today.

@PabloInchausti
Copy link

I might have triggered this issue with a query that I sent yesterday to a few of you (apologies for the redundancy).

I feel that there should be consistency in the call of model_parameters and perhaps a much clearer wording in the help and the vignettes.

For the "typical" sizes (a few dozens, < 100?) of datasets analyzed by scientists, I think that it does matter whether non-parametric or parametric bootstrap is used to estimate the 95%CIs and p-values. Non-parametric bootstrapping of a small sample until doomsday will never generate values on the tails of the real distribution of the response variable that are key to 95%CI and p-values. Regardless of the prob distribution of the response variable of a model, values in the upper/lower 10% are by definition unlikely to be represented in a sample of moderate size. Hence, the 95%CIs and p-values obtained by non-parametric bootstrap are bound to be somewhat biased and imprecise.

Provided that one has a statistical model with a reasonable goodness of fit to data, parametric bootstrap should almost always provide better 95%CI and p-values than non-parametric one. The reason is that generating pseudosample values in the upper/lower 10% from a prob distribution reasonably fitted by a model depends on the number of bootstrap replications (i.e. computer brute force) and not on the size of the original data.

Almost certainly, the differences in CIs and p-value between nonparametric vs parametric bootstrap disappear for large (>250?) sample sizes, but in these cases one should probably rely on the asymptotics of Wald statistics to get the CIs and p-values.

Other R libraries (e.g. glmmboot) doing bootstrap on these models employ such "weasel wording" in the description of their methods that one must query the authors to know what they actually do. This is why I feel that the excellent set of libraries of your "ecosystem" deserve a better and clerer description for the benefit of their users (like me).
Cheers
Pablo

@strengejacke
Copy link
Member

I feel that there should be consistency in the call of model_parameters and perhaps a much clearer wording in the help and the vignettes

In which way, which part could probably be improved?

@PabloInchausti
Copy link

If I might be bold for a minute, I would suggest three things;
a) Use systematically parametric bootstrap to generate 95%CI and p-values for any model (lm, glm, glmer, lmer, glmmTMB, and the other supported in your package parameters). A user can currently "trick" bootstrap_model into doing parametric bootstrap by fitting a lm, glm, glmer, lmer and others using glmmTMB because the latter encompasses all other models as "special cases". But imposing parametric boosttrap for any model should be done more elegantly by suitable programming.
b) Explain clearly in the help of bootstrap_model and bootstrap_parameters what method (parametric, non-parametric) is actually used, and why, pointing to a fuller explanation in the vignette.
c) Explain in the vignette why it is nearly always preferable to use parametric bootstrap. If the text I posted before is useful, edit and use it, or ask me to rewrite it (without any strings attached; the package and the hard work is yours, not mine). If useful for your purposes, I can send you an example of binary GLM with 200+ data points (that fits well) showing that pvalues estimated by Wald stats and paramboots may differ by orders of magnitude, and likewise that 95%CI limits by 200% at times. Nonparametric bootstrap may be even worse and is much harder to implement when data is structured by groups (as in random effects) or has a temporal or a spatial structure and these are the types of data that scientists and engineers increasingly have to analyze.

@strengejacke
Copy link
Member

hm, option "parallel" for boot() doesn't really return sensible results?

model <- lm(mpg ~ wt + factor(cyl), data = mtcars)

boot_function <- function(model, data, indices) {
  d <- mtcars[indices, ] # allows boot to select sample
  fit <- stats::update(model, data = d)

  params <- insight::get_parameters(fit)
  n_params <- insight::n_parameters(model)
  
  if (nrow(params) != n_params) {
    params <- stats::setNames(rep.int(NA, n_params), params$Parameter)
  } else {
    params <- stats::setNames(params$Estimate, params$Parameter) # Transform to named vector
  }
  
  return(params)
}

set.seed(123)
results <- boot::boot(
  data = mtcars,
  statistic = boot_function,
  R = 100,
  sim = "ordinary",
  model = model
)

as.data.frame(results$t) |> head()
#>         V1        V2        V3        V4
#> 1 33.40302 -3.247039 -3.697935 -5.852254
#> 2 35.10031 -3.802603 -3.918454 -4.430873
#> 3 33.47632 -2.927751 -4.632131 -5.792161
#> 4 33.55433 -2.763675 -4.787897 -7.392401
#> 5 29.48630 -2.436482 -2.511552 -4.973958
#> 6 34.77243 -3.755184 -3.219585 -4.911665


set.seed(123)
results <- boot::boot(
  data = mtcars,
  statistic = boot_function,
  R = 100,
  sim = "parametric",
  model = model
)

as.data.frame(results$t) |> head()
#>         V1        V2        V3       V4
#> 1 33.99079 -3.205613 -4.255582 -6.07086
#> 2 33.99079 -3.205613 -4.255582 -6.07086
#> 3 33.99079 -3.205613 -4.255582 -6.07086
#> 4 33.99079 -3.205613 -4.255582 -6.07086
#> 5 33.99079 -3.205613 -4.255582 -6.07086
#> 6 33.99079 -3.205613 -4.255582 -6.07086

Created on 2021-05-27 by the reprex package (v2.0.0)

@strengejacke
Copy link
Member

library(parameters)
model <- lm(mpg ~ wt + factor(cyl), data = mtcars)
set.seed(123)
model_parameters(model, bootstrap = TRUE, iterations = 100)
#> Parameter   | Coefficient |         95% CI |     p
#> --------------------------------------------------
#> (Intercept) |       34.21 | [29.50, 39.25] | 0.010
#> wt          |       -3.09 | [-5.02, -1.98] | 0.010
#> cyl [6]     |       -4.33 | [-6.49, -1.61] | 0.010
#> cyl [8]     |       -5.99 | [-9.17, -2.71] | 0.010

set.seed(123)
model_parameters(model, bootstrap = TRUE, iterations = 100, type = "parametric")
#> Parameter   | Coefficient |         95% CI |     p
#> --------------------------------------------------
#> (Intercept) |       33.99 | [33.99, 33.99] | 0.010
#> wt          |       -3.21 | [-3.21, -3.21] | 0.010
#> cyl [6]     |       -4.26 | [-4.26, -4.26] | 0.010
#> cyl [8]     |       -6.07 | [-6.07, -6.07] | 0.010

Created on 2021-05-27 by the reprex package (v2.0.0)

@bwiernik
Copy link
Contributor

@strengejacke I think it's a little weird that the reported point estimates are changing when bootstrap approaches are used. I think most users requesting bootstrap intervals would be expecting the maximum likelihood estimates along with the bootstrap intervals.

@PabloInchausti
Copy link

If model_parameters(model, bootstrap = TRUE, iterations = 100, type = "parametric") does parametric bootstrap for all common models (lm, glm, glmer, lmer, glmmTMB, etc), I might have provoked a storm on a tea cup. In my view, it is still strange that the user has to reach to "See 'Examples' in model_parameters.default" in the help of model_parameters to know that he/she can carry out parametric bootstrap, and that nothing in this regard is said in the help of bootstrap_model and bootstrap_parameters. If so, I should think that some editorial work in the help of the latter functions and in the vignette would solve my original query. I still stand by the relative merits of parametric vs nonparametric bootstrap from before.
If agree that users expect to see their ML estimates in the summary rather than the bootstrapped (median, mean) estimates. The curious results of the last table (estimates equal the limits of the 95%CI) can only be wrong in my view.

@strengejacke
Copy link
Member

@strengejacke I think it's a little weird that the reported point estimates are changing when bootstrap approaches are used. I think most users requesting bootstrap intervals would be expecting the maximum likelihood estimates along with the bootstrap intervals.

The more curious thing here is that the bootstrapped estimates are all the same when type = "parametric", see previous answer from me, or the CI, which bounds are the same as the point estimate.

@strengejacke
Copy link
Member

But for skewed data, isn't a bootstrapped estimate more useful? Else, one could just request robust standard errors.

@bwiernik
Copy link
Contributor

bwiernik commented May 27, 2021

Oh, yeah, that’s not right. Can you link to where you specify the parametric call to boot? You need to also give the ran.gen function (eg, simulate) and mle estimate.

@bwiernik
Copy link
Contributor

bwiernik commented May 27, 2021

My view is that if you’re going to make a parametric assumption with a single level model, you may as well construct intervals using an analytic method such as pivot or profile likelihood. These methods are tractable for such models and computationally more efficient. I don’t see all that much value in parametric bootstrap in those cases—in those cases, the value of the bootstrap is to avoid parametric assumptions when the sampling distribution isn’t likely to follow those assumptions.

@strengejacke
Copy link
Member

The code is in bootstrap_model():

bootstrap_model.default <- function(model,

@strengejacke
Copy link
Member

My view is that ...

So you would suggest we stick to the default "ordinary"?

@PabloInchausti
Copy link

I believe that if one sets type="ordinary", the command boot performs a nonparametric bootstrap by default. The parametric bootstrap option in the command boot requires providing a ran.gen function to generate new R pseudovalues of the response variable using the parameters of the originally fitted model, and then refit the model on the R pseudosamples to obtain the bootstrapped sampling distribution for each model parameter, to calculate the p-values and CI.
I think that the point of parametric bootstrap in GLM(M) type of models is to have better (less biased, more precise) p.values and CI for moderate-sized data sets for which the asymptotic Wald statistics and quadratic approximations of (multidimensional) likelihood functions do not necessarily apply, or render poor results. Even for most good fitting models, with more than 200(?; I am unsure what the threshold is since it also depends on model complexity) datapoints, we may happily live in asymptotics, but with less data as most scientists typically have, I would trust more the p-values and CI obtained after parametric bootstrap.

@bwiernik
Copy link
Contributor

This

results <- boot::boot(
    data = data,
    statistic = boot_function,
    R = iterations,
    sim = type,
    parallel = parallel,
    ncpus = n_cpus,
    model = model
)

needs to be something like:

f <- function(x, mle) {
    out <- insight::get_data(x)
    resp <- simulate(x, nsim = 1)
    out[[insight::find_response(x)]] <- resp
    return(out)
}

results <- boot::boot(
    data = data,
    statistic = boot_function,
    R = iterations,
    sim = type,
    parallel = parallel,
    ncpus = n_cpus,
    model = model,
    ran.gen = f
)

@bwiernik
Copy link
Contributor

bwiernik commented May 27, 2021

I would argue that the primary motivation for parametric bootstrap is the same as MCMC or other computational methods--to conduct inference when analytic methods are intractable (e.g., due to model complexity or convergence issues due to shallow likelihood surfaces or parameters near boundaries).

Based on my reading of literatures I'm familiar with, parametric bootstrap intervals generally have similar coverage as profile intervals. When a pivot is available (e.g., for linear regression coefficients), these usually perform better even in small samples than either profile or parametric bootstrap (this is the whole basis on fiducial inference). For a variety of problems, parametric bootstrap can have somewhat or severely erratic performance (e.g., https://www.genetics.org/content/174/1/481; https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.2514; https://www.tandfonline.com/doi/pdf/10.1080/10705511.2011.607072).

Of course, you may disagree, but in my view, the assumption that differences in deviance values are χ2 distributed becomes reasonably correct even at fairly small sample sizes (e.g., N = 20–30; some bootstrap texts jokingly suggest N = 42). In my experiences in social sciences, most analysts adopt a bootstrap approach because they wish to avoid parametric assumptions, rather than due to coverage concerns. So, I think the current default basic bootstrap approach is likely best.

@strengejacke
Copy link
Member

To summarize:

  • we keep the defaults as they are now (ordinary for non-mixed, parametric for mixed)
  • for parametric non-mixed, we add a ran.gen function (which I did here: 52f59db)

So can we close this issue?

@PabloInchausti
Copy link

PabloInchausti commented Jun 7, 2021

I would say that it is reasonable to close the issue. Many thanks for all your effort.

I think I found a minor sign inconsistency on the overdispersion parameter of the negative binomial models fitted with glmmTMB. The fitted parameter must be positive (0.566) and it comes as negative when bootstrapped. It might be useful to change the label in the bootstrap table from "(Intercept).1" to "overdispersion param." Also, when the fitted model is Zero Truncated Negative Binomial, the overdispersion parameter does not appear when bootstrapped. The latter may not be not a great loss for most users, and I just wonder if for overall consistency it might be better to just delete it from the glm.NB fitted with glmmTMB. The choice and the work is yours, of course.

library(glmmTMB)
glm.NB <- glmmTMB(count ~ mined, family=nbinom2, data=Salamanders)
summary(glm.NB)
# Family: nbinom2  ( log )
# Formula:          count ~ mined
# Data: Salamanders
#     AIC      BIC   logLik deviance df.resid 
#  1762.3   1775.7   -878.2   1756.3      641 
# Overdispersion parameter for nbinom2 family (): **0.566** 
# 
# Conditional model:
#            Estimate Std. Error z value Pr(>|z|)    
# (Intercept)  -1.2192     0.1293  -9.427   <2e-16 ***
# minedno       2.0368     0.1527  13.343   <2e-16 ***
#
bootstrap_parameters(model=glm.NB, iterations=100)
# # Fixed Effects
# 
# Parameter     | Coefficient |         95% CI |     p
# ----------------------------------------------------
# (Intercept)   |       -1.26 | [-1.54, -0.87] | 0.010
# minedno       |        2.07 | [ 1.67,  2.37] | 0.010
# (Intercept).1 |       **-0.58** | [-0.79, -0.28] | 0.010

@bwiernik
Copy link
Contributor

bwiernik commented Jun 7, 2021

@PabloInchausti Internally, the overdispersion parameter is formulated as negative by glmmTMB:

glm.NB <- glmmTMB::glmmTMB(count ~ mined, family=glmmTMB::nbinom2, data=glmmTMB::Salamanders)
#> Warning in Matrix::sparseMatrix(dims = c(0, 0), i = integer(0), j =
#> integer(0), : 'giveCsparse' has been deprecated; setting 'repr = "T"' for you

#> Warning in Matrix::sparseMatrix(dims = c(0, 0), i = integer(0), j =
#> integer(0), : 'giveCsparse' has been deprecated; setting 'repr = "T"' for you

#> Warning in Matrix::sparseMatrix(dims = c(0, 0), i = integer(0), j =
#> integer(0), : 'giveCsparse' has been deprecated; setting 'repr = "T"' for you
summary(glm.NB)
#>  Family: nbinom2  ( log )
#> Formula:          count ~ mined
#> Data: glmmTMB::Salamanders
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   1762.3   1775.7   -878.2   1756.3      641 
#> 
#> 
#> Overdispersion parameter for nbinom2 family (): 0.566 
#> 
#> Conditional model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  -1.2192     0.1293  -9.427   <2e-16 ***
#> minedno       2.0368     0.1527  13.343   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
glm.NB$fit
#> $par
#>       beta       beta      betad 
#> -1.2192401  2.0367623 -0.5696049 
#> 
#> $objective
#> [1] 878.1677
#> 
#> $convergence
#> [1] 0
#> 
#> $iterations
#> [1] 11
#> 
#> $evaluations
#> function gradient 
#>       15       12 
#> 
#> $message
#> [1] "relative convergence (4)"
#> 
#> $parfull
#>       beta       beta      betad 
#> -1.2192401  2.0367623 -0.5696049

Created on 2021-06-07 by the reprex package (v2.0.0)

(If you post more code examples, please format them using reprex::reprex()--it makes it much easier to read.)

@PabloInchausti
Copy link

The overdispersion parameter is indeed reported as negative in $par and $parfull, but not in the summary seen and interpreted by all users. glmmTMB probable changes the sign given that this parameter in the NB2 parameterization of the NegBinom distribution must be a real positive number, and perhaps your package parameters ought to do likewise.

@bwiernik
Copy link
Contributor

bwiernik commented Jun 8, 2021

Specifically, it's applying the inverse link function for the sigma model. For gaussian, that's exp(.5*betad). For Gamma, that's exp(-.5*betad). For all others, that's exp(betad).

These only apply if there is a single betad parameter, not multiple terms in the dispersion model.

@bwiernik
Copy link
Contributor

@strengejacke After we bootstrap glmmTMB models, we should split out the parameters into cond, zi, and disp lists separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Consistency 🍏 🍎 Expected output across functions could be more similar
Projects
None yet
Development

No branches or pull requests

4 participants