Parametric bootstrap for mixed-effects models

Author

Phillip Alday, Douglas Bates, and Reinhold Kliegl

Published

2026-02-09

The speed of MixedModels.jl relative to its predecessors makes the parametric bootstrap much more computationally tractable. This is valuable because the parametric bootstrap can be used to produce more accurate confidence intervals than methods based on standard errors or profiling of the likelihood surface.

This page is adapted from the MixedModels.jl docs

1 The parametric bootstrap

Bootstrapping is a family of procedures for generating sample values of a statistic, allowing for visualization of the distribution of the statistic or for inference from this sample of values. Bootstrapping also belongs to a larger family of procedures called resampling, which are based on creating new samples of data from an existing one, then computing statistics on the new samples, in order to examine the distribution of the relevant statistics.

A parametric bootstrap is used with a parametric model, m, that has been fit to data. The procedure is to simulate n response vectors from m using the estimated parameter values and refit m to these responses in turn, accumulating the statistics of interest at each iteration.

The parameters of a LinearMixedModel object are the fixed-effects parameters (β), the standard deviation (σ), of the per-observation noise, and the covariance parameter (θ), that defines the variance-covariance matrices of the random effects. A technical description of the covariance parameter can be found in the MixedModels.jl docs. Lisa Schwetlick and Daniel Backhaus have provided a more beginner-friendly description of the covariance parameter in the documentation for MixedModelsSim.jl. For today’s purposes – looking at the uncertainty in the estimates from a fitted model – we can simply use values from the fitted model, but we will revisit the parametric bootstrap as a convenient way to simulate new data, potentially with different parameter values, for power analysis.

Attach the packages to be used

Code

using AlgebraOfGraphics
using CairoMakie
using DataFrames
using MixedModels
using MixedModelsMakie
using Random
using SMLP2026: dataset

using AlgebraOfGraphics: AlgebraOfGraphics as AoG
const progress=false

Note that the precise stream of random numbers generated for a given seed can change between Julia versions. For exact reproducibility, you either need to have the exact same Julia version or use the StableRNGs package.

2 A model of moderate complexity

The kb07 data (Kronmüller & Barr, 2007) are one of the datasets provided by the MixedModels package.

kb07 = dataset(:kb07)

Arrow.Table with 1789 rows, 7 columns, and schema:
 :subj      String
 :item      String
 :spkr      String
 :prec      String
 :load      String
 :rt_trunc  Int16
 :rt_raw    Int16

Convert the table to a DataFrame for summary.

kb07 = DataFrame(kb07)
describe(kb07)

7×7 DataFrame

Row	variable	mean	min	median	max	nmissing	eltype
	Symbol	Union…	Any	Union…	Any	Int64	DataType
1	subj		S030		S103	0	String
2	item		I01		I32	0	String
3	spkr		new		old	0	String
4	prec		break		maintain	0	String
5	load		no		yes	0	String
6	rt_trunc	2182.2	579	1940.0	5171	0	Int16
7	rt_raw	2226.24	579	1940.0	15923	0	Int16

The experimental factors; spkr, prec, and load, are two-level factors.

contrasts = Dict(:spkr => EffectsCoding(),
                 :prec => EffectsCoding(),
                 :load => EffectsCoding(),
)

The EffectsCoding contrast is used with these to create a ±1 encoding.

We can look at an initial fit of moderate complexity:

form = @formula(rt_trunc ~ 1 + spkr * prec * load +
                          (1 + spkr + prec + load | subj) +
                          (1 + spkr + prec + load | item))
m0 = fit(MixedModel, form, kb07; contrasts, progress)

	Est.	SE	z	p	σ_subj	σ_item
(Intercept)	2181.6728	77.3040	28.22	<1e-99	301.7798	362.2071
spkr: old	67.7485	18.2919	3.70	0.0002	42.9978	40.6939
prec: maintain	-333.9212	47.1520	-7.08	<1e-11	62.0047	246.8895
load: yes	78.7703	19.5373	4.03	<1e-04	65.1516	42.3617
spkr: old & prec: maintain	-21.9656	15.8060	-1.39	0.1646
spkr: old & load: yes	18.3843	15.8060	1.16	0.2448
prec: maintain & load: yes	4.5339	15.8060	0.29	0.7742
spkr: old & prec: maintain & load: yes	23.6074	15.8060	1.49	0.1353
Residual	668.4953

The default display in Quarto uses the pretty MIME show method for the model and omits the estimated correlations of the random effects.

The VarCorr extractor displays these.

VarCorr(m0)

	Column	Variance	Std.Dev	Corr.
subj	(Intercept)	91071.0419	301.7798
	spkr: old	1848.8141	42.9978	+0.78
	prec: maintain	3844.5766	62.0047	-0.59	+0.03
	load: yes	4244.7317	65.1516	+0.36	+0.83	+0.53
item	(Intercept)	131193.9807	362.2071
	spkr: old	1655.9905	40.6939	+0.44
	prec: maintain	60954.4361	246.8895	-0.69	+0.35
	load: yes	1794.5158	42.3617	+0.32	+0.16	-0.14
Residual		446885.9859	668.4953

None of the two-factor or three-factor interaction terms in the fixed-effects are significant. In the random-effects terms only the scalar random effects and the prec random effect for item appear to be warranted, leading to the reduced formula

# formula f4 from https://doi.org/10.33016/nextjournal.100002
form = @formula(rt_trunc ~ 1 + spkr * prec * load + (1 | subj) + (1 + prec | item))

m1 = fit(MixedModel, form, kb07; contrasts, progress)

	Est.	SE	z	p	σ_item	σ_subj
(Intercept)	2181.7582	77.4710	28.16	<1e-99	364.7293	298.1107
spkr: old	67.8114	16.0526	4.22	<1e-04
prec: maintain	-333.8582	47.4631	-7.03	<1e-11	252.6694
load: yes	78.6849	16.0525	4.90	<1e-06
spkr: old & prec: maintain	-21.8802	16.0525	-1.36	0.1729
spkr: old & load: yes	18.3214	16.0526	1.14	0.2537
prec: maintain & load: yes	4.4710	16.0526	0.28	0.7806
spkr: old & prec: maintain & load: yes	23.5219	16.0525	1.47	0.1428
Residual	678.9319

VarCorr(m1)

	Column	Variance	Std.Dev	Corr.
item	(Intercept)	133027.436	364.729
	prec: maintain	63841.833	252.669	-0.70
subj	(Intercept)	88870.014	298.111
Residual		460948.573	678.932

These two models are nested and can be compared with a likelihood-ratio test.

MixedModels.likelihoodratiotest(m0, m1)

	model-dof	-2 logLik	χ²	χ²-dof	P(>χ²)
rt_trunc ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + (1 + spkr + prec + load \| subj) + (1 + spkr + prec + load \| item)	29	-28637
rt_trunc ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + (1 \| subj) + (1 + prec \| item)	13	-28658	21	-16	0.1649

The p-value of approximately 20% leads us to prefer the simpler model, m1, to the more complex, m0.

3 Bootstrap basics

To bootstrap the model parameters, first initialize a random number generator then create a bootstrap sample and extract the table of parameter estimates from it.

const RNG = MersenneTwister(42)
samp = parametricbootstrap(RNG, 5_000, m1)
tbl = samp.tbl

Table with 18 columns and 5000 rows:
      obj      β1       β2       β3        β4       β5        β6        ⋯
    ┌────────────────────────────────────────────────────────────────────
 1  │ 28666.6  2173.04  55.6592  -359.246  87.1657  -35.2652  32.9777   ⋯
 2  │ 28709.1  1973.88  88.8108  -313.754  83.8083  -23.2173  -15.9147  ⋯
 3  │ 28575.8  2167.28  48.2509  -297.174  79.0748  -22.07    41.8946   ⋯
 4  │ 28721.3  2259.36  65.9821  -386.726  82.7247  -35.3908  14.3295   ⋯
 5  │ 28590.9  2182.95  45.0315  -410.477  109.293  -19.132   42.9369   ⋯
 6  │ 28518.9  2167.98  61.0523  -360.559  75.7268  15.1612   61.9125   ⋯
 7  │ 28702.9  2210.25  70.8323  -349.703  99.4985  -32.9348  16.797    ⋯
 8  │ 28726.1  2235.04  59.248   -353.738  58.0711  -35.2804  10.0765   ⋯
 9  │ 28734.7  2042.97  72.1467  -300.828  71.2183  -14.2944  10.7301   ⋯
 10 │ 28609.2  2011.1   76.404   -254.604  63.9238  -37.8758  3.95621   ⋯
 11 │ 28636.7  2166.35  71.1184  -324.401  58.5484  -21.7266  -31.4973  ⋯
 12 │ 28659.2  2158.36  64.3944  -336.451  80.3281  -39.3608  1.96649   ⋯
 13 │ 28637.3  2147.09  70.6536  -399.734  60.0463  -35.6682  28.4599   ⋯
 14 │ 28672.7  2138.23  75.4572  -262.36   96.9987  -32.0501  26.9257   ⋯
 15 │ 28662.5  2239.34  65.1886  -373.371  83.4639  -29.7632  5.43283   ⋯
 16 │ 28604.9  2088.54  62.0142  -268.106  99.684   -27.3686  49.5337   ⋯
 17 │ 28539.6  2313.47  61.1166  -337.952  74.7689  -14.5982  48.7173   ⋯
 ⋮  │    ⋮        ⋮        ⋮        ⋮         ⋮        ⋮         ⋮      ⋱

An empirical density plot of the estimates of the residual standard deviation is obtained as

plt = data(tbl) * mapping(:σ) * AoG.density()
draw(plt; axis=(;title="Parametric bootstrap estimates of σ"))

A density plot of the estimates of the standard deviation of the random effects is obtained as

plt = data(tbl) * mapping(
  [:σ1, :σ2, :σ3] .=> "Bootstrap replicates of standard deviations";
  color=dims(1) => renamer(["Item intercept", "Item speaker", "Subj"])
) * AoG.density()
draw(plt; figure=(;supertitle="Parametric bootstrap estimates of variance components"))

The bootstrap sample can be used to generate intervals that cover a certain percentage of the bootstrapped values. We refer to these as “coverage intervals”, similar to a confidence interval. The shortest such intervals, obtained with the confint extractor, correspond to a highest posterior density interval in Bayesian inference.

We generate these for all random and fixed effects:

confint(samp; method=:shortest)

DictTable with 2 columns and 13 rows:
 par   lower      upper
 ────┬─────────────────────
 β1  │ 2033.36    2332.44
 β2  │ 34.9886    98.421
 β3  │ -430.862   -246.17
 β4  │ 48.9002    112.424
 β5  │ -51.4927   12.1684
 β6  │ -13.6705   49.681
 β7  │ -27.2138   34.6244
 β8  │ -6.36897   55.7056
 ρ1  │ -0.899991  -0.460395
 σ   │ 654.51     700.692
 σ1  │ 264.116    448.735
 σ2  │ 174.044    316.133
 σ3  │ 228.265    359.429

draw(
  data(samp.β) * mapping(:β; color=:coefname) * AoG.density();
  figure=(; resolution=(800, 450)),
)

┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/FUAHr/src/scenes.jl:238

For the fixed effects, MixedModelsMakie provides a convenience interface to plot the combined coverage intervals and density plots

ridgeplot(samp)

Often the intercept will be on a different scale and potentially less interesting, so we can stop it from being included in the plot:

ridgeplot(samp; show_intercept=false)

4 Singularity

Let’s consider the classic dysetuff dataset:

dyestuff = dataset(:dyestuff)
mdye = fit(MixedModel, @formula(yield ~ 1 + (1 | batch)), dyestuff)

	Est.	SE	z	p	σ_batch
(Intercept)	1527.5000	17.6946	86.33	<1e-99	37.2603
Residual	49.5101

sampdye = parametricbootstrap(MersenneTwister(1234321), 10_000, mdye)
tbldye = sampdye.tbl

Table with 5 columns and 10000 rows:
      obj      β1       σ        σ1       θ1
    ┌─────────────────────────────────────────────
 1  │ 316.738  1528.14  43.5735  22.6691  0.520249
 2  │ 336.101  1552.01  58.1535  39.5166  0.679521
 3  │ 322.046  1501.55  51.857   0.0      0.0
 4  │ 326.893  1525.24  43.6512  66.3744  1.52057
 5  │ 331.544  1522.05  50.7415  51.0074  1.00524
 6  │ 326.892  1550.23  53.5174  19.0928  0.356759
 7  │ 313.786  1504.16  45.1878  0.0      0.0
 8  │ 322.636  1494.05  49.8556  17.7733  0.356494
 9  │ 323.582  1529.74  46.4678  35.2094  0.757716
 10 │ 321.639  1532.6   45.3573  32.4904  0.71632
 11 │ 331.565  1548.25  55.8562  28.7512  0.514736
 12 │ 310.076  1481.8   34.9418  38.4381  1.10006
 13 │ 314.492  1541.95  40.3751  28.3694  0.702646
 14 │ 340.876  1530.48  63.006   42.6409  0.676776
 15 │ 309.778  1496.17  41.5411  8.08345  0.194589
 16 │ 335.375  1530.32  57.4864  38.9066  0.676797
 17 │ 318.347  1511.28  41.0926  39.1128  0.95182
 ⋮  │    ⋮        ⋮        ⋮        ⋮        ⋮

plt = data(tbldye) * mapping(:σ1) * AoG.density()
draw(plt; axis=(;title="Parametric bootstrap estimates of σ_batch"))

Notice that this density plot has a spike, or mode, at zero. Although this mode appears to be diffuse, this is an artifact of the way that density plots are created. In fact, it is a pulse, as can be seen from a histogram.

plt = data(tbldye) * mapping(:σ1) * AoG.histogram(;bins=100)
draw(plt; axis=(;title="Parametric bootstrap estimates of σ_batch"))

A value of zero for the standard deviation of the random effects is an example of a singular covariance. It is easy to detect the singularity in the case of a scalar random-effects term. However, it is not as straightforward to detect singularity in vector-valued random-effects terms.

For example, if we bootstrap a model fit to the sleepstudy data

sleepstudy = dataset(:sleepstudy)
msleep = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days | subj)),
             sleepstudy)

	Est.	SE	z	p	σ_subj
(Intercept)	251.4051	6.6323	37.91	<1e-99	23.7807
days	10.4673	1.5022	6.97	<1e-11	5.7169
Residual	25.5918

sampsleep = parametricbootstrap(MersenneTwister(666), 10_000, msleep)
tblsleep = sampsleep.tbl

Table with 10 columns and 10000 rows:
      obj      β1       β2       σ        σ1       σ2       ρ1           ⋯
    ┌─────────────────────────────────────────────────────────────────────
 1  │ 1694.07  258.109  10.7928  22.1713  13.639   5.42874  0.430468     ⋯
 2  │ 1731.59  241.267  9.77438  23.5943  23.8822  7.07983  -0.277566    ⋯
 3  │ 1706.51  254.403  10.6731  23.0742  21.4347  3.78781  0.493327     ⋯
 4  │ 1737.33  257.833  7.61741  24.5027  27.2652  4.8567   -0.0191251   ⋯
 5  │ 1764.5   244.154  13.0045  25.8322  30.9873  7.7826   -0.532294    ⋯
 6  │ 1730.85  247.672  13.2977  24.3583  18.6138  5.8777   -0.0542334   ⋯
 7  │ 1709.66  256.847  8.60195  22.4272  26.887   6.62176  -0.638       ⋯
 8  │ 1776.49  252.028  9.56227  27.3707  27.7004  6.53339  -0.353796    ⋯
 9  │ 1732.12  248.687  11.5173  23.7154  22.2868  6.69766  0.304276     ⋯
 10 │ 1748.89  249.387  10.2581  25.3961  24.7505  5.44656  0.00469702   ⋯
 11 │ 1735.34  247.16   8.57304  25.593   19.2145  3.99177  1.0          ⋯
 12 │ 1757.34  248.074  9.7696   25.9833  29.286   4.74156  0.216284     ⋯
 13 │ 1727.32  248.007  12.8942  23.9316  32.1766  3.5042   0.152094     ⋯
 14 │ 1753.38  252.684  9.88844  25.2124  21.3447  7.64868  -0.00705103  ⋯
 15 │ 1731.05  262.218  7.70813  25.0737  18.1157  4.89352  -0.280422    ⋯
 16 │ 1742.68  252.273  12.7376  24.4489  20.684   7.47373  0.0242213    ⋯
 17 │ 1727.37  243.956  12.107   24.6336  26.502   3.89655  -0.411461    ⋯
 ⋮  │    ⋮        ⋮        ⋮        ⋮        ⋮        ⋮          ⋮       ⋱

the singularity can be exhibited as a standard deviation of zero or as a correlation of ±1.

confint(sampsleep)

DictTable with 2 columns and 6 rows:
 par   lower      upper
 ────┬───────────────────
 β1  │ 238.265    264.28
 β2  │ 7.56523    13.4072
 ρ1  │ -0.409975  1.0
 σ   │ 22.6673    28.5759
 σ1  │ 10.5611    33.2576
 σ2  │ 3.08367    7.73195

A histogram of the estimated correlations from the bootstrap sample has a spike at +1.

plt = data(tblsleep) * mapping(:ρ1) * AoG.histogram(;bins=100)
draw(plt; axis=(;title="Parametric bootstrap samples of correlation of random effects"))

or, as a count,

count(tblsleep.ρ1  .≈ 1)

Close examination of the histogram shows a few values of -1.

count(tblsleep.ρ1 .≈ -1)

Furthermore there are even a few cases where the estimate of the standard deviation of the random effect for the intercept is zero.

count(tblsleep.σ1 .≈ 0)

There is a general condition to check for singularity of an estimated covariance matrix or matrices in a bootstrap sample. The parameter optimized in the estimation is θ, the relative covariance parameter. Some of the elements of this parameter vector must be non-negative and, when one of these components is approximately zero, one of the covariance matrices will be singular.

The issingular method for a MixedModel object that tests if a parameter vector θ corresponds to a boundary or singular fit.

This operation is encapsulated in a method for the issingular function that works on MixedModelBootstrap objects.

count(issingular(sampsleep))

5 References

Kronmüller, E., & Barr, D. J. (2007). Perspective-free pragmatics: Broken precedents and the recovery-from-preemption hypothesis. Journal of Memory and Language, 56(3), 436–455. https://doi.org/10.1016/j.jml.2006.05.002

This page was rendered from git revision f875caa using Quarto 1.8.26.