Bootstrapping a fitted model

Author

Phillip Alday, Douglas Bates, and Reinhold Kliegl

Published

2026-02-09

Begin by loading the packages to be used.

Code

using AlgebraOfGraphics
using CairoMakie
using DataFrames
using MixedModels
using Random
using SMLP2026: dataset

const progress=false

1 Data set and model

The kb07 data (Kronmüller & Barr, 2007) are one of the datasets provided by the MixedModels package.

kb07 = dataset(:kb07)

Arrow.Table with 1789 rows, 7 columns, and schema:
 :subj      String
 :item      String
 :spkr      String
 :prec      String
 :load      String
 :rt_trunc  Int16
 :rt_raw    Int16

Convert the table to a DataFrame for summary.

describe(DataFrame(kb07))

7×7 DataFrame

Row	variable	mean	min	median	max	nmissing	eltype
	Symbol	Union…	Any	Union…	Any	Int64	DataType
1	subj		S030		S103	0	String
2	item		I01		I32	0	String
3	spkr		new		old	0	String
4	prec		break		maintain	0	String
5	load		no		yes	0	String
6	rt_trunc	2182.2	579	1940.0	5171	0	Int16
7	rt_raw	2226.24	579	1940.0	15923	0	Int16

The experimental factors; spkr, prec, and load, are two-level factors. The EffectsCoding contrast is used with these to create a \(\pm1\) encoding.

contrasts = Dict{Symbol,Any}(nm => EffectsCoding() for nm in (:spkr, :prec, :load))

The display of an initial model fit

kbm01 = let
  form = @formula(
    rt_trunc ~
      1 +
      spkr * prec * load +
      (1 + spkr + prec + load | subj) +
      (1 + spkr + prec + load | item)
  )
  fit(MixedModel, form, kb07; contrasts, progress)
end

	Est.	SE	z	p	σ_subj	σ_item
(Intercept)	2181.6728	77.3040	28.22	<1e-99	301.7798	362.2071
spkr: old	67.7485	18.2919	3.70	0.0002	42.9978	40.6939
prec: maintain	-333.9212	47.1520	-7.08	<1e-11	62.0047	246.8895
load: yes	78.7703	19.5373	4.03	<1e-04	65.1516	42.3617
spkr: old & prec: maintain	-21.9656	15.8060	-1.39	0.1646
spkr: old & load: yes	18.3843	15.8060	1.16	0.2448
prec: maintain & load: yes	4.5339	15.8060	0.29	0.7742
spkr: old & prec: maintain & load: yes	23.6074	15.8060	1.49	0.1353
Residual	668.4953

does not include the estimated correlations of the random effects.

The VarCorr extractor displays these.

VarCorr(kbm01)

	Column	Variance	Std.Dev	Corr.
subj	(Intercept)	91071.0419	301.7798
	spkr: old	1848.8141	42.9978	+0.78
	prec: maintain	3844.5766	62.0047	-0.59	+0.03
	load: yes	4244.7317	65.1516	+0.36	+0.83	+0.53
item	(Intercept)	131193.9807	362.2071
	spkr: old	1655.9905	40.6939	+0.44
	prec: maintain	60954.4361	246.8895	-0.69	+0.35
	load: yes	1794.5158	42.3617	+0.32	+0.16	-0.14
Residual		446885.9859	668.4953

None of the two-factor or three-factor interaction terms in the fixed-effects are significant. In the random-effects terms only the scalar random effects and the prec random effect for item appear to be warranted, leading to the reduced formula

kbm02 = let
  form = @formula(
    rt_trunc ~
      1 + spkr + prec + load + (1 | subj) + (1 + prec | item)
  )
  fit(MixedModel, form, kb07; contrasts, progress)
end

	Est.	SE	z	p	σ_item	σ_subj
(Intercept)	2181.8526	77.4680	28.16	<1e-99	364.7121	298.0257
spkr: old	67.8790	16.0785	4.22	<1e-04
prec: maintain	-333.7906	47.4476	-7.03	<1e-11	252.5236
load: yes	78.5904	16.0785	4.89	<1e-05
Residual	680.0318

VarCorr(kbm02)

	Column	Variance	Std.Dev	Corr.
item	(Intercept)	133014.905	364.712
	prec: maintain	63768.184	252.524	-0.70
subj	(Intercept)	88819.302	298.026
Residual		462443.261	680.032

These two models are nested and can be compared with a likelihood-ratio test.

MixedModels.likelihoodratiotest(kbm02, kbm01)

	model-dof	-2 logLik	χ²	χ²-dof	P(>χ²)
rt_trunc ~ 1 + spkr + prec + load + (1 \| subj) + (1 + prec \| item)	9	-28664
rt_trunc ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + (1 + spkr + prec + load \| subj) + (1 + spkr + prec + load \| item)	29	-28637	27	20	0.1431

The p-value of approximately 17% leads us to prefer the simpler model, kbm02, to the more complex, kbm01.

2 A bootstrap sample

Create a bootstrap sample of a few thousand parameter estimates from the reduced model. The pseudo-random number generator is initialized to a fixed value for reproducibility.

Random.seed!(1234321)
kbm02samp = parametricbootstrap(2000, kbm02)
kbm02tbl = kbm02samp.tbl

Table with 14 columns and 2000 rows:
      obj      β1       β2       β3        β4       σ        σ1       σ2       ⋯
    ┌───────────────────────────────────────────────────────────────────────────
 1  │ 28772.3  2163.91  60.5609  -312.083  53.8854  702.338  372.785  244.83   ⋯
 2  │ 28715.1  2232.6   83.7997  -383.283  47.4501  693.167  391.838  214.006  ⋯
 3  │ 28575.3  2194.67  65.0581  -348.838  78.2132  660.998  380.117  241.471  ⋯
 4  │ 28728.5  2261.27  66.174   -351.407  89.6083  690.344  334.602  306.404  ⋯
 5  │ 28631.4  2309.15  77.5687  -341.306  72.5969  676.497  323.768  269.473  ⋯
 6  │ 28627.8  2091.9   54.8119  -295.505  74.487   667.146  426.801  278.072  ⋯
 7  │ 28685.7  2291.62  73.7219  -396.565  84.4817  687.627  393.807  194.634  ⋯
 8  │ 28654.4  2019.04  72.7075  -203.805  87.8736  679.825  304.365  208.357  ⋯
 9  │ 28664.1  1999.72  60.8485  -308.63   74.0233  680.412  357.133  305.426  ⋯
 10 │ 28618.3  2249.85  52.1815  -394.464  89.8091  670.656  385.055  226.33   ⋯
 11 │ 28748.9  2166.95  77.9659  -326.597  81.0316  696.842  378.969  234.48   ⋯
 12 │ 28731.4  2153.68  74.5144  -341.394  44.1759  701.902  323.252  177.605  ⋯
 13 │ 28623.9  2155.23  93.193   -373.268  73.2941  671.256  335.981  217.315  ⋯
 14 │ 28615.9  2344.1   62.8553  -314.684  88.7448  665.313  410.652  259.44   ⋯
 15 │ 28636.2  2286.5   35.9889  -349.732  62.0524  670.473  361.639  304.034  ⋯
 16 │ 28676.6  2289.43  76.9805  -298.764  87.139   686.466  316.206  232.854  ⋯
 17 │ 28614.8  2044.53  53.2336  -304.859  112.995  675.065  354.343  220.036  ⋯
 ⋮  │    ⋮        ⋮        ⋮        ⋮         ⋮        ⋮        ⋮        ⋮     ⋱

One of the uses of such a sample is to form “confidence intervals” on the parameters by obtaining the shortest interval that covers a given proportion (95%, by default) of the sample.

confint(kbm02samp)

DictTable with 2 columns and 9 rows:
 par   lower      upper
 ────┬────────────────────
 β1  │ 2022.91    2334.0
 β2  │ 34.0592    96.72
 β3  │ -430.239   -239.802
 β4  │ 46.5349    109.526
 ρ1  │ -0.907255  -0.49009
 σ   │ 657.341    702.655
 σ1  │ 270.065    451.979
 σ2  │ 181.741    325.128
 σ3  │ 233.834    364.504

A sample like this can be used for more than just creating an interval because it approximates the distribution of the estimator. For the fixed-effects parameters the estimators are close to being normally distributed, Figure 1.

Code

draw(
  data(kbm02samp.β) * mapping(:β; color=:coefname) * AlgebraOfGraphics.density();
  figure=(; resolution=(800, 450)),
)

┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/FUAHr/src/scenes.jl:238

Figure 1: Comparative densities of the fixed-effects coefficients in kbm02samp

Code

let pars = ["σ1", "σ2", "σ3"]
  draw(
    data(kbm02tbl) *
    mapping(pars .=> "σ"; color=dims(1) => renamer(pars)) *
    AlgebraOfGraphics.density();
    figure=(; resolution=(800, 450)),
  )
end

┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/FUAHr/src/scenes.jl:238

Figure 2: Density plot of bootstrap samples standard deviation of random effects

Code

draw(
  data(kbm02tbl) *
  mapping(:ρ1 => "Correlation") *
  AlgebraOfGraphics.density();
  figure=(; resolution=(800, 450)),
)

┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/FUAHr/src/scenes.jl:238

Figure 3: Density plot of correlation parameters in bootstrap sample from model kbm02

3 References

Kronmüller, E., & Barr, D. J. (2007). Perspective-free pragmatics: Broken precedents and the recovery-from-preemption hypothesis. Journal of Memory and Language, 56(3), 436–455. https://doi.org/10.1016/j.jml.2006.05.002

This page was rendered from git revision f875caa using Quarto 1.8.26.