Power Analysis and Simulation Tutorial
contributed by Lisa Schwetlick and Daniel Backhaus
This tutorial demonstrates how to conduct power analyses and data simulation using Julia and the MixedModelsSim package.
Power analysis is an important tool for planning an experimental design. Here we show how to:
- Use existing data as a basis for power calculations by simulating new data.
- Adapt parameters in a given linear mixed model to analyze power without changing the existing data set.
- Create a (simple) balanced fully crossed dataset from scratch and analyze power.
- Recreate a more complex dataset from scratch and analyze power for a specific model parameter across various sample sizes.
Setup
Load the packages we'll be using in Julia
First, here are the packages needed in this example.
using MixedModels # run mixed models
using MixedModelsSim # simulation utilities
using DataFrames, Tables # work with dataframes
using StableRNGs # random number generator
using Statistics # basic statistical functions
using DataFrameMacros # dplyr-like operations
using CairoMakie # plotting package
CairoMakie.activate!(type="svg") # use vector graphics
using MixedModelsMakie # some extra plotting function for MixedModels
using ProgressMeter # show progress in loops
Define number of iterations
Here we define how many model simulations we want to do. A large number will give more reliable results, but will take longer to compute. It is useful to set it to a low number for testing, and increase it for your final analysis.
# for real power analysis, set this much higher
nsims = 100
100
Use existing data to simulate new data
Build a linear mixed model from existing data
For the first example we are going to simulate data bootstrapped from an existing data set, namely Experiment 2 from
Kronmüller, E., & Barr, D. J. (2007). Perspective-free pragmatics: Broken precedents and the recovery-from-preemption hypothesis. Journal of Memory and Language, 56(3), 436-455.
This was an experiment about how, in a conversation, a change of speaker or a change of precedent (a pattern of word usage established to describe an object; e.g., the same object can be referred to as "white shoes", "runners", or "sneakers") affects comprehension.
In the experiment, objects were presented on a screen while participants listened to instructions to move the objects around. Participants' eye movements were tracked. The dependent variable is response time, defined as the latency between the onset of the test description and the moment at which the target was selected. The independent variables are speaker (old vs. new), precedent (maintain vs. break), and cognitive load (yes vs. no; from a secondary memory task).
We first load the data and define some characteristics like the contrasts and the underlying model. This dataset is one of the example datasets provided by MixedModels.jl.
Load existing data:
kb07 = MixedModels.dataset(:kb07);
Arrow.Table with 1789 rows, 7 columns, and schema:
:subj String
:item String
:spkr String
:prec String
:load String
:rt_trunc Int16
:rt_raw Int16
Set contrasts:
contrasts = Dict(:spkr => HelmertCoding(),
# set the reference level such that all the coefs
# have the same sign, which makes the plotting nicer
:prec => HelmertCoding(base="maintain"),
:load => HelmertCoding(),
# pseudo-contrast for grouping variables
:item => Grouping(),
:subj => Grouping());
Dict{Symbol, StatsModels.AbstractContrasts} with 5 entries:
:item => Grouping()
:spkr => HelmertCoding(nothing, nothing)
:load => HelmertCoding(nothing, nothing)
:prec => HelmertCoding("maintain", nothing)
:subj => Grouping()
The chosen linear mixed model (LMM) for this dataset is defined by the following model formula:
kb07_f = @formula(rt_trunc ~ 1 + spkr + prec + load + (1|subj) + (1 + prec|item));
FormulaTerm
Response:
rt_trunc(unknown)
Predictors:
1
spkr(unknown)
prec(unknown)
load(unknown)
(subj)->1 | subj
(prec,item)->(1 + prec) | item
Fit the model:
kb07_m = fit(MixedModel, kb07_f, kb07; contrasts=contrasts)
Linear mixed model fit by maximum likelihood
rt_trunc ~ 1 + spkr + prec + load + (1 | subj) + (1 + prec | item)
logLik -2 logLik AIC AICc BIC
-14331.9251 28663.8501 28681.8501 28681.9513 28731.2548
Variance components:
Column Variance Std.Dev. Corr.
item (Intercept) 133015.241 364.713
prec: break 63766.936 252.521 +0.70
subj (Intercept) 88819.437 298.026
Residual 462443.388 680.032
Number of obs: 1789; levels of grouping factors: 32, 56
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 2181.85 77.4681 28.16 <1e-99
spkr: old 67.879 16.0785 4.22 <1e-04
prec: break 333.791 47.4472 7.03 <1e-11
load: yes 78.5904 16.0785 4.89 <1e-05
───────────────────────────────────────────────────
Simulate from existing data with same model parameters
We will first look at the power of the dataset with the same parameters as in the original data set. This means that each simulated dataset will have exactly as many observations as the original data. Here, we use the model kb07_m we fitted above to our dataset kb07.
You can use the parametricbootstrap() function to run nsims iterations of data sampled using the parameters from kb07_m.
The parametric bootstrap is actually a simulation procedure. Each bootstrap iteration
- simulates new data based on an existing model
- then fits a model to that data to obtain new estimates.
Set up a random seed to make the simulation reproducible. You can use your favourite number.
To use multithreading, you need to set the number of worker threads you want to use. In VS Code, open the settings (gear icon in the lower left corner) and search for "thread". Set julia.NumThreads to the number of threads you want to use (at most 1 less than the total number of processor cores available, so that you can continue watching YouTube while the simulation runs).
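You can confirm the setting took effect by checking how many threads the current Julia session was started with (purely a sanity check; it does not change the setting):

```julia
# Threads is part of Base, so no extra package is needed.
# Reports the count set via `julia --threads N` or the julia.NumThreads setting.
Threads.nthreads()
```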
Set random seed for reproducibility:
rng = StableRNG(42);
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000055)
Run nsims iterations:
kb07_sim = parametricbootstrap(rng, nsims, kb07_m; use_threads = false);
MixedModelBootstrap with 100 samples
parameter min q25 median mean q75 max
┌───────────────────────────────────────────────────────────────────────
1 │ β1 1972.91 2120.95 2174.44 2177.43 2223.65 2440.67
2 │ β2 29.0592 56.4117 69.0202 67.744 77.2938 120.402
3 │ β3 219.728 296.828 335.884 331.123 366.562 428.224
4 │ β4 52.5324 67.1774 79.775 79.5486 88.9861 127.721
5 │ σ 642.221 670.691 679.78 679.179 689.241 707.028
6 │ σ1 238.635 325.211 353.375 353.592 385.475 455.22
7 │ σ2 159.904 231.306 252.174 250.285 270.406 325.66
8 │ ρ1 0.210796 0.647399 0.726712 0.701727 0.776084 0.915608
9 │ σ3 212.887 271.696 292.297 294.669 320.797 369.09
10 │ θ1 0.35074 0.479788 0.523039 0.520726 0.563495 0.692626
11 │ θ2 0.0743114 0.221852 0.262455 0.260232 0.296547 0.378759
12 │ θ3 0.147011 0.222005 0.253307 0.254759 0.283882 0.376143
13 │ θ4 0.319389 0.403036 0.431072 0.433986 0.472289 0.551725
Try: Run the code above with and without use_threads = true. Did performance get better or worse? Check out the help for parametricbootstrap to see why!
The returned value kb07_sim contains the results of the bootstrapping procedure, which we can convert to a dataframe:
df = DataFrame(kb07_sim.allpars);
first(df, 12)
Row | iter | type | group | names | value |
---|---|---|---|---|---|
Int64 | String | String? | String? | Float64 | |
1 | 1 | β | missing | (Intercept) | 2284.56 |
2 | 1 | β | missing | spkr: old | 29.0592 |
3 | 1 | β | missing | prec: break | 379.985 |
4 | 1 | β | missing | load: yes | 53.2007 |
5 | 1 | σ | item | (Intercept) | 262.512 |
6 | 1 | σ | item | prec: break | 258.77 |
7 | 1 | ρ | item | (Intercept), prec: break | 0.210796 |
8 | 1 | σ | subj | (Intercept) | 291.503 |
9 | 1 | σ | residual | missing | 672.498 |
10 | 2 | β | missing | (Intercept) | 2123.42 |
11 | 2 | β | missing | spkr: old | 64.8504 |
12 | 2 | β | missing | prec: break | 289.896 |
The dataframe df has 900 rows: 9 parameters, each from 100 iterations.
nrow(df)
900
We can now plot some bootstrapped parameters:
fig = Figure()
σres = @subset(df, :type == "σ" && :group == "residual")
ax = Axis(fig[1,1:2]; xlabel = "residual standard deviation", ylabel = "Density")
density!(ax, σres.value)
βInt = @subset(df, :type == "β" && :names == "(Intercept)")
ax = Axis(fig[1,3]; xlabel = "fixed effect for intercept")
density!(ax, βInt.value)
βSpeaker = @subset(df, :type == "β" && :names == "spkr: old")
ax = Axis(fig[2,1]; xlabel = "fixed effect for spkr: old", ylabel = "Density")
density!(ax, βSpeaker.value)
βPrecedents = @subset(df, :type == "β" && :names == "prec: break")
ax = Axis(fig[2,2]; xlabel = "fixed effect for prec: break")
density!(ax, βPrecedents.value)
βLoad = @subset(df, :type == "β" && :names == "load: yes")
ax = Axis(fig[2,3]; xlabel = "fixed effect for load: yes")
density!(ax, βLoad.value)
Label(fig[0,:]; text = "Parametric bootstrap replicates by parameter", fontsize=25)
fig
For the fixed effects, we can do this more succinctly via the ridgeplot functionality in MixedModelsMakie, even optionally omitting the intercept (which we often don't care about).
ridgeplot(kb07_sim; show_intercept=false)
Next, we extract the p-values of the fixed-effects parameters into a dataframe
kb07_sim_df = DataFrame(kb07_sim.coefpvalues);
first(kb07_sim_df, 8)
Row | iter | coefname | β | se | z | p |
---|---|---|---|---|---|---|
Int64 | Symbol | Float64 | Float64 | Float64 | Float64 | |
1 | 1 | (Intercept) | 2284.56 | 62.6397 | 36.4714 | 3.15013e-291 |
2 | 1 | spkr: old | 29.0592 | 15.9004 | 1.82757 | 0.067614 |
3 | 1 | prec: break | 379.985 | 48.4292 | 7.84618 | 4.28886e-15 |
4 | 1 | load: yes | 53.2007 | 15.9004 | 3.34587 | 0.000820244 |
5 | 2 | (Intercept) | 2123.42 | 86.3467 | 24.5918 | 1.54533e-133 |
6 | 2 | spkr: old | 64.8504 | 16.2805 | 3.98333 | 6.7957e-5 |
7 | 2 | prec: break | 289.896 | 56.5224 | 5.12887 | 2.91494e-7 |
8 | 2 | load: yes | 83.4592 | 16.2804 | 5.12635 | 2.95417e-7 |
Now that we have the bootstrapped data, we can start our power calculation.
Power calculation
The function power_table() from MixedModelsSim takes the output of parametricbootstrap() and calculates, for each coefficient, the proportion of simulations in which the p-value was less than alpha. You can set the alpha argument to change the default value of 0.05 (justify your alpha).
ptbl = power_table(kb07_sim, 0.05)
4-element Vector{@NamedTuple{coefname::String, power::Float64}}:
(coefname = "prec: break", power = 1.0)
(coefname = "(Intercept)", power = 1.0)
(coefname = "spkr: old", power = 0.99)
(coefname = "load: yes", power = 1.0)
An estimated power of 1 means that the p-value for the parameter we are looking at was below our alpha in every iteration; 0.5 means it was below alpha in half of the iterations; 0 means it was never below alpha.
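In miniature, the power estimate is just the proportion of p-values below alpha. A tiny sketch with made-up p-values (purely illustrative numbers, not from the bootstrap above):

```julia
using Statistics

pvals = [0.001, 0.20, 0.03, 0.04, 0.60]  # hypothetical p-values from five iterations
alpha = 0.05
power = mean(pvals .< alpha)  # proportion of significant iterations: 3 of 5 = 0.6
```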
You can also do it manually:
prec_p = kb07_sim_df[kb07_sim_df.coefname .== Symbol("prec: break"),:p]
mean(prec_p .< 0.05)
1.0
For a nicer display, you can use pretty_table:
pretty_table(ptbl)
┌─────────────┬─────────┐
│ coefname │ power │
│ String │ Float64 │
├─────────────┼─────────┤
│ prec: break │ 1.0 │
│ (Intercept) │ 1.0 │
│ spkr: old │ 0.99 │
│ load: yes │ 1.0 │
└─────────────┴─────────┘
The simulation so far should not be interpreted as the power of the original K&B experiment. Observed power calculations are generally problematic. Instead, the example so far should serve to show how power can be computed using a simulation procedure.
In the next section, we show how to use previously observed data – such as pilot data – as the basis for a simulation study with a different effect size.
Adapt parameters in a given linear mixed model to analyze power without generating additional data
Let's say we want to check our power to detect effects of spkr, prec, and load that are only half the size of those in our pilot data. We can set a new vector of beta values (fixed effects) with the β argument to parametricbootstrap().
Specify β:
new_beta = kb07_m.β
new_beta[2:4] = kb07_m.β[2:4] / 2
3-element Vector{Float64}:
33.93951082097325
166.89531060759816
39.29522053456475
Run simulations:
kb07_sim_half = parametricbootstrap(StableRNG(42), nsims, kb07_m; β = new_beta, use_threads = false);
MixedModelBootstrap with 100 samples
parameter min q25 median mean q75 max
┌───────────────────────────────────────────────────────────────────────
1 │ β1 1972.91 2120.95 2174.44 2177.43 2223.65 2440.67
2 │ β2 -4.88035 22.4722 35.0807 33.8044 43.3542 86.4629
3 │ β3 52.8326 129.933 168.989 164.228 199.667 261.328
4 │ β4 13.2372 27.8822 40.4798 40.2533 49.6909 88.4257
5 │ σ 642.221 670.691 679.78 679.179 689.241 707.028
6 │ σ1 238.635 325.211 353.375 353.592 385.475 455.208
7 │ σ2 159.904 231.306 252.174 250.285 270.406 325.66
8 │ ρ1 0.210796 0.647399 0.726712 0.701727 0.776084 0.915608
9 │ σ3 212.887 271.696 292.297 294.669 320.797 369.09
10 │ θ1 0.35074 0.479788 0.523039 0.520726 0.563495 0.692607
11 │ θ2 0.0743114 0.221852 0.262455 0.260231 0.296547 0.378759
12 │ θ3 0.147011 0.222005 0.253307 0.254759 0.283882 0.376143
13 │ θ4 0.319389 0.403036 0.431072 0.433986 0.472289 0.551725
Power calculation
power_table(kb07_sim_half)
4-element Vector{@NamedTuple{coefname::String, power::Float64}}:
(coefname = "prec: break", power = 0.93)
(coefname = "(Intercept)", power = 1.0)
(coefname = "spkr: old", power = 0.54)
(coefname = "load: yes", power = 0.73)
Create and analyze a (simple) balanced fully crossed dataset from scratch
In some situations, instead of using an existing dataset it may be useful to simulate the data from scratch. This could be the case when pilot data or data from a previous study are not available. Note that we still have to assume (or perhaps guess) a particular effect size, which can be derived from previous work (whether experimental or practical). In the worst case, we can start from the smallest effect size that we would find interesting or meaningful.
In order to simulate data from scratch, we have to:
- specify the effect sizes manually
- manually create an experimental design, according to which data can be simulated
If we simulate data from scratch, we can manipulate the arguments β, σ, and θ (in addition to the number of subjects and items). Let's take a closer look at them, define their meaning, and see where the corresponding values appear in the model output.
Fixed Effects (βs)
The βs are our effect sizes. If we look again at the LMM summary of the kb07 dataset, kb07_m, we see our four βs under the fixed-effects parameters in the Coef. column.
kb07_m
Linear mixed model fit by maximum likelihood
rt_trunc ~ 1 + spkr + prec + load + (1 | subj) + (1 + prec | item)
logLik -2 logLik AIC AICc BIC
-14331.9251 28663.8501 28681.8501 28681.9513 28731.2548
Variance components:
Column Variance Std.Dev. Corr.
item (Intercept) 133015.241 364.713
prec: break 63766.936 252.521 +0.70
subj (Intercept) 88819.437 298.026
Residual 462443.388 680.032
Number of obs: 1789; levels of grouping factors: 32, 56
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 2181.85 77.4681 28.16 <1e-99
spkr: old 67.879 16.0785 4.22 <1e-04
prec: break 333.791 47.4472 7.03 <1e-11
load: yes 78.5904 16.0785 4.89 <1e-05
───────────────────────────────────────────────────
kb07_m.β
4-element Vector{Float64}:
2181.852639288019
67.8790216419465
333.7906212151963
78.5904410691295
(These can also be accessed with the appropriately named coef function.)
Residual Variance (σ)
σ is the residual standard deviation, listed under the variance components.
kb07_m.σ
680.0319021648276
Random Effects (θ)
The meaning of θ is a bit less intuitive. In a less complex model (one with only random intercepts), or if we suppress the correlations in the formula with zerocorr(), θ describes the ratio of the standard deviations of the random effects to the residual standard deviation.
In our kb07_m example:
- The residual standard deviation is 680.032.
- The standard deviation of our first variance component, item (Intercept), is 364.713.
- Thus our first θ is the ratio of the variance component's standard deviation to the residual standard deviation: $364.713 / 680.032 = 0.53631$.
kb07_m.θ
4-element Vector{Float64}:
0.5363168168365643
0.25981992537057197
0.26530160511766143
0.43825282248191233
We can also calculate the θ for the variance component subj (Intercept). The residual standard deviation is 680.032 and the standard deviation of subj (Intercept) is 298.026. Thus, the related θ is $298.026 / 680.032 = 0.438252$.
We cannot calculate the θs for the variance component item prec: break this way, because it involves the correlation of item prec: break with item (Intercept). But keep in mind that the ratio of the item prec: break variability (252.521) to the residual variability (680.032) is $252.521 / 680.032 = 0.3713369$.
The θ vector is the flattened version of the lower Cholesky factor of the (relative) variance-covariance matrix. The Cholesky factor is in some sense a "matrix square root" (so like storing standard deviations instead of variances) and is a lower triangular matrix. When all off-diagonal elements are zero, the diagonal elements are just the relative standard deviations (the σs) and we can use our calculation above. Nonzero off-diagonal elements arise from the covariances, which correspond to the correlations (the ρs). Since that is the case in our kb07 dataset, one way to get the two missing θ values is to take them directly from the model we have already fitted.
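We can verify this relationship numerically with the values from the model summary (an arithmetic check only; the numbers are copied from the output above and rounded, so the match is approximate):

```julia
using LinearAlgebra

σ_int  = 364.713   # item (Intercept) standard deviation
σ_prec = 252.521   # item prec: break standard deviation
ρ      = 0.70      # their correlation
σ_res  = 680.032   # residual standard deviation

# covariance matrix of the by-item random effects
Σ = [σ_int^2        ρ*σ_int*σ_prec;
     ρ*σ_int*σ_prec σ_prec^2]

# lower Cholesky factor, scaled by the residual standard deviation,
# reproduces the by-item block of θ
λ = cholesky(Σ).L ./ σ_res
```

Flattening λ column-wise gives [0.536, 0.260, 0.265], matching the first three elements of kb07_m.θ up to rounding.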
See the two inner values:
kb07_m.θ
4-element Vector{Float64}:
0.5363168168365643
0.25981992537057197
0.26530160511766143
0.43825282248191233
Another way is to make use of the create_re() function. Here you have to define the ratio of each random-effect variability to the residual variability, as shown above, together with the correlation matrices.
Let's start by defining the correlation matrix for the item part.
The diagonal is always 1.0 because everything is perfectly correlated with itself. The elements below the diagonal follow the same form as the Corr. entries in the output of VarCorr(). In our example the correlation of item prec: break and item (Intercept) is 0.7. The elements above the diagonal are just a mirror image.
re_item_corr = [1.0 0.7; 0.7 1.0]
2×2 Matrix{Float64}:
1.0 0.7
0.7 1.0
Now we put together all the standard-deviation ratios and the correlation matrix for the item group. This calculates the covariance factorization, which is the theta matrix.
re_item = create_re(0.536, 0.371; corrmat = re_item_corr)
2×2 LinearAlgebra.LowerTriangular{Float64, Matrix{Float64}}:
0.536 ⋅
0.2597 0.264947
Don't be too specific with your values in create_re(). Generally we don't know these values very precisely, because estimating them precisely requires large amounts of data. Additionally, if there are numerical problems (rounding errors), you will get the error message: PosDefException: matrix is not Hermitian; Cholesky factorization failed.
Although we advise against using these values, you can extract the exact values like so (just as a pedagogical demonstration):
corr_exact = VarCorr(kb07_m).σρ.item.ρ[1]
σ_residuals_exact = kb07_m.σ
σ_1_exact = VarCorr(kb07_m).σρ.item.σ[1] / σ_residuals_exact
σ_2_exact = VarCorr(kb07_m).σρ.item.σ[2] / σ_residuals_exact
re_item_corr = [1.0 corr_exact; corr_exact 1.0]
re_item = create_re(σ_1_exact, σ_2_exact; corrmat = re_item_corr)
2×2 LinearAlgebra.LowerTriangular{Float64, Matrix{Float64}}:
0.536317 ⋅
0.25982 0.265302
Let's continue with the subj part.
Since the by-subject random effects have only one entry (the intercept), there are no correlations to specify and we can omit the corrmat argument.
Now we put together the standard-deviation ratio for the subj group. This calculates the covariance factorization, which is the theta matrix.
re_subj = create_re(0.438)
1×1 LinearAlgebra.LowerTriangular{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}}:
0.438
If you want the exact value, you can use:
σ_residuals_exact = kb07_m.σ
σ_3_exact = VarCorr(kb07_m).σρ.subj.σ[1] / σ_residuals_exact
re_subj = create_re(σ_3_exact)
1×1 LinearAlgebra.LowerTriangular{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}}:
0.43825282248191233
As mentioned above, θ is the compact form of these covariance matrices:
createθ(kb07_m; item=re_item, subj=re_subj)
4-element Vector{Float64}:
0.5363168168365643
0.2598199253705719
0.26530160511766143
0.43825282248191233
The function createθ puts the random effects into the correct order and then into the compact form. Even for the same formula, the order may vary between datasets (and hence between models fit to those datasets) because of a particular computational trick used in MixedModels.jl. Because of this trick, we need to specify the model along with the random effects so that the correct order can be determined.
We can insert these parameters into the parametricbootstrap() function or into the model like this:
# need the fully qualified name here because Makie also defines update!
MixedModelsSim.update!(kb07_m, item=re_item, subj=re_subj)
Linear mixed model fit by maximum likelihood
rt_trunc ~ 1 + spkr + prec + load + (1 | subj) + (1 + prec | item)
logLik -2 logLik AIC AICc BIC
-14331.9251 28663.8501 28681.8501 28681.9513 28731.2548
Variance components:
Column Variance Std.Dev. Corr.
item (Intercept) 133015.241 364.713
prec: break 63766.936 252.521 +0.70
subj (Intercept) 88819.437 298.026
Residual 462443.388 680.032
Number of obs: 1789; levels of grouping factors: 32, 56
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 2181.85 77.4681 28.16 <1e-99
spkr: old 67.879 16.0785 4.22 <1e-04
prec: break 333.791 47.4472 7.03 <1e-11
load: yes 78.5904 16.0785 4.89 <1e-05
───────────────────────────────────────────────────
A simple example from scratch
With this knowledge about the parameters, we can now simulate data from scratch.
The simdat_crossed()
function from MixedModelsSim
lets you set up a data frame with a specified experimental design. For now, it only makes fully balanced crossed designs, but you can generate an unbalanced design by simulating data for the largest cell and deleting extra rows.
First, we will set up an easy design in which subj_n subjects per age group (Old or Young) respond to item_n items in each of two conditions (A or B).
Your factors need to be specified separately for between-subject, between-item, and within-subject/item factors, using a Dict with the name of each factor as the key and a vector of level names as the value.
We start with the between subject factors:
subj_btwn = Dict(:age => ["O", "Y"])
Dict{Symbol, Vector{String}} with 1 entry:
:age => ["O", "Y"]
There are no between-item factors in this design, so you can omit the argument or set it to nothing. Note that factors which are both between-subject and between-item need to be put in both dicts.
item_btwn = nothing
Next, we put within-subject/item factors in a dict:
both_win = Dict(:condition => ["A", "B"])
Dict{Symbol, Vector{String}} with 1 entry:
:condition => ["A", "B"]
Define subject and item number:
subj_n = 10
item_n = 30
30
Simulate data:
dat = simdat_crossed(subj_n,
item_n,
subj_btwn = subj_btwn,
item_btwn = item_btwn,
both_win = both_win);
600-element Vector{@NamedTuple{subj::String, age::String, item::String, condition::String, dv::Float64}}:
(subj = "S01", age = "O", item = "I01", condition = "A", dv = -0.24215291846520284)
(subj = "S02", age = "Y", item = "I01", condition = "A", dv = 0.7225005470580272)
(subj = "S03", age = "O", item = "I01", condition = "A", dv = -2.0069936030000735)
(subj = "S04", age = "Y", item = "I01", condition = "A", dv = -0.3427746997906516)
(subj = "S05", age = "O", item = "I01", condition = "A", dv = -0.3193794034761923)
(subj = "S06", age = "Y", item = "I01", condition = "A", dv = -1.6500819398849065)
(subj = "S07", age = "O", item = "I01", condition = "A", dv = 0.49684616924627895)
(subj = "S08", age = "Y", item = "I01", condition = "A", dv = 1.9033291222989595)
(subj = "S09", age = "O", item = "I01", condition = "A", dv = 0.5065794597251123)
(subj = "S10", age = "Y", item = "I01", condition = "A", dv = -1.5119523303151654)
⋮
(subj = "S02", age = "Y", item = "I30", condition = "B", dv = 0.026687857921569556)
(subj = "S03", age = "O", item = "I30", condition = "B", dv = 0.3411636337763573)
(subj = "S04", age = "Y", item = "I30", condition = "B", dv = 0.5989676794898203)
(subj = "S05", age = "O", item = "I30", condition = "B", dv = -0.5269773634173776)
(subj = "S06", age = "Y", item = "I30", condition = "B", dv = 0.19516400328730782)
(subj = "S07", age = "O", item = "I30", condition = "B", dv = -0.7788201226232759)
(subj = "S08", age = "Y", item = "I30", condition = "B", dv = 0.8857669490659474)
(subj = "S09", age = "O", item = "I30", condition = "B", dv = 1.3305398408572155)
(subj = "S10", age = "Y", item = "I30", condition = "B", dv = -0.49086325293426303)
Have a look:
first(DataFrame(dat),8)
Row | subj | age | item | condition | dv |
---|---|---|---|---|---|
String | String | String | String | Float64 | |
1 | S01 | O | I01 | A | -0.242153 |
2 | S02 | Y | I01 | A | 0.722501 |
3 | S03 | O | I01 | A | -2.00699 |
4 | S04 | Y | I01 | A | -0.342775 |
5 | S05 | O | I01 | A | -0.319379 |
6 | S06 | Y | I01 | A | -1.65008 |
7 | S07 | O | I01 | A | 0.496846 |
8 | S08 | Y | I01 | A | 1.90333 |
The values we see in the dv column are just random noise (drawn from the standard normal distribution).
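A quick check of what standard normal noise implies, using Julia's own randn (which may not be the exact generator simdat_crossed uses internally):

```julia
using Statistics, Random

rng = Random.MersenneTwister(2021)
noise = randn(rng, 100_000)   # standard normal draws, like the placeholder dv
mean(noise), std(noise)       # both close to (0.0, 1.0)
```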
Set contrasts:
contrasts = Dict(:age => HelmertCoding(),
:condition => HelmertCoding(),
:subj => Grouping(),
:item => Grouping());
Dict{Symbol, StatsModels.AbstractContrasts} with 4 entries:
:item => Grouping()
:age => HelmertCoding(nothing, nothing)
:condition => HelmertCoding(nothing, nothing)
:subj => Grouping()
Define formula:
f1 = @formula(dv ~ 1 + age * condition + (1|item) + (1|subj));
FormulaTerm
Response:
dv(unknown)
Predictors:
1
age(unknown)
condition(unknown)
age(unknown) & condition(unknown)
(item)->1 | item
(subj)->1 | subj
Note that we did not include condition as a random slope for item or subject. This is mainly to keep the example simple and to keep the parameter θ easier to understand (see Section 3 above for the explanation of θ).
Fit the model:
m1 = fit(MixedModel, f1, dat; contrasts=contrasts)
Linear mixed model fit by maximum likelihood
dv ~ 1 + age + condition + age & condition + (1 | item) + (1 | subj)
logLik -2 logLik AIC AICc BIC
-843.5849 1687.1699 1701.1699 1701.3591 1731.9484
Variance components:
Column Variance Std.Dev.
item (Intercept) 0.0000000 0.0000000
subj (Intercept) 0.0028776 0.0536435
Residual 0.9717536 0.9857756
Number of obs: 600; levels of grouping factors: 30, 10
Fixed-effects parameters:
────────────────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
────────────────────────────────────────────────────────────
(Intercept) 0.0056789 0.0436732 0.13 0.8965
age: Y 0.0696404 0.0436732 1.59 0.1108
condition: B 0.057254 0.0402441 1.42 0.1548
age: Y & condition: B 0.0104384 0.0402441 0.26 0.7953
────────────────────────────────────────────────────────────
Because dv is just random noise from N(0,1), there is basically no subject or item random variance, the residual variance is near 1.0, and the estimates for all effects are small. Don't worry: we'll specify the fixed and random effects directly in parametricbootstrap().
Set random seed for reproducibility:
rng = StableRNG(42);
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000055)
Specify β, σ, and θ (we just made these parameters up):
new_beta = [0., 0.25, 0.25, 0.]
new_sigma = 2.0
new_theta = [1.0, 1.0]
2-element Vector{Float64}:
1.0
1.0
Run nsims iterations:
sim1 = parametricbootstrap(rng, nsims, m1;
β = new_beta,
σ = new_sigma,
θ = new_theta,
use_threads = false);
MixedModelBootstrap with 100 samples
parameter min q25 median mean q75 ⋯
┌───────────────────────────────────────────────────────────────────────
1 │ β1 -1.91722 -0.327597 0.210513 0.125857 0.583837 ⋯
2 │ β2 -1.35913 -0.158688 0.361579 0.352458 0.888642 ⋯
3 │ β3 0.0618462 0.187246 0.251725 0.248267 0.303721 ⋯
4 │ β4 -0.22035 -0.0530245 -0.0126518 -0.00511795 0.051643 ⋯
5 │ σ 1.81734 1.95216 2.00679 1.9932 2.03363 ⋯
6 │ σ1 1.29715 1.82294 1.99914 2.00614 2.22138 ⋯
7 │ σ2 1.01407 1.55278 1.83232 1.83316 2.09607 ⋯
8 │ θ1 0.633916 0.901324 1.01477 1.00645 1.11295 ⋯
9 │ θ2 0.475464 0.780095 0.898011 0.920961 1.05958 ⋯
Power calculation
ptbl = power_table(sim1)
4-element Vector{@NamedTuple{coefname::String, power::Float64}}:
(coefname = "age: Y", power = 0.23)
(coefname = "condition: B", power = 0.86)
(coefname = "(Intercept)", power = 0.08)
(coefname = "age: Y & condition: B", power = 0.04)
For a nicer display, you can use pretty_table:
pretty_table(ptbl)
┌───────────────────────┬─────────┐
│ coefname │ power │
│ String │ Float64 │
├───────────────────────┼─────────┤
│ age: Y │ 0.23 │
│ condition: B │ 0.86 │
│ (Intercept) │ 0.08 │
│ age: Y & condition: B │ 0.04 │
└───────────────────────┴─────────┘
Compute power curves for a more complex dataset
Recreate the kb07 dataset from scratch
For full control over all parameters in our kb07 data set, we will recreate the design using the method shown above.
Define subject and item number:
subj_n = 56
item_n = 32
32
Define factors in a dict:
subj_btwn = nothing
item_btwn = nothing
both_win = Dict(:spkr => ["old", "new"],
:prec => ["maintain", "break"],
:load => ["yes", "no"]);
Dict{Symbol, Vector{String}} with 3 entries:
:spkr => ["old", "new"]
:load => ["yes", "no"]
:prec => ["maintain", "break"]
Try: Play with simdat_crossed.
Simulate data:
fake_kb07 = simdat_crossed(subj_n, item_n,
subj_btwn = subj_btwn,
item_btwn = item_btwn,
both_win = both_win);
14336-element Vector{@NamedTuple{subj::String, item::String, spkr::String, load::String, prec::String, dv::Float64}}:
(subj = "S01", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 1.1388531286517134)
(subj = "S02", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = -0.22890292648148877)
(subj = "S03", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 1.1600260147134884)
(subj = "S04", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 1.6447046517680908)
(subj = "S05", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 0.9591158963071323)
(subj = "S06", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 0.31497188612035754)
(subj = "S07", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = -0.9805466866662579)
(subj = "S08", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 0.5103988119580477)
(subj = "S09", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = 0.3333984496060959)
(subj = "S10", item = "I01", spkr = "old", load = "yes", prec = "maintain", dv = -1.11802262960405)
⋮
(subj = "S48", item = "I32", spkr = "new", load = "no", prec = "break", dv = 1.0391190512530417)
(subj = "S49", item = "I32", spkr = "new", load = "no", prec = "break", dv = 0.5673552371930647)
(subj = "S50", item = "I32", spkr = "new", load = "no", prec = "break", dv = -1.4822014421465957)
(subj = "S51", item = "I32", spkr = "new", load = "no", prec = "break", dv = -0.9584950297204301)
(subj = "S52", item = "I32", spkr = "new", load = "no", prec = "break", dv = -1.4630057361022544)
(subj = "S53", item = "I32", spkr = "new", load = "no", prec = "break", dv = -0.4871861034652149)
(subj = "S54", item = "I32", spkr = "new", load = "no", prec = "break", dv = 1.7118711066127348)
(subj = "S55", item = "I32", spkr = "new", load = "no", prec = "break", dv = -1.179094206470141)
(subj = "S56", item = "I32", spkr = "new", load = "no", prec = "break", dv = 1.8813741628525904)
Make a dataframe:
fake_kb07_df = DataFrame(fake_kb07);
Have a look:
first(fake_kb07_df,8)
| Row | subj | item | spkr | load | prec | dv |
|-----|------|------|------|------|------|-----------|
| 1 | S01 | I01 | old | yes | maintain | 1.13885 |
| 2 | S02 | I01 | old | yes | maintain | -0.228903 |
| 3 | S03 | I01 | old | yes | maintain | 1.16003 |
| 4 | S04 | I01 | old | yes | maintain | 1.6447 |
| 5 | S05 | I01 | old | yes | maintain | 0.959116 |
| 6 | S06 | I01 | old | yes | maintain | 0.314972 |
| 7 | S07 | I01 | old | yes | maintain | -0.980547 |
| 8 | S08 | I01 | old | yes | maintain | 0.510399 |
The function simdat_crossed generates a balanced, fully crossed design. Unfortunately, our original design is not fully crossed: every subject saw each image only once, i.e. in only one of the eight possible conditions. To simulate this, we keep only one of every eight rows.
We sort the dataframe to make the selection easier:
sort!(fake_kb07_df, [:subj, :item, :load, :prec, :spkr])
To select only the relevant rows of the data set, we define an index representing a random choice of one row from each set of eight. First we generate a vector idx indicating which row to keep within each set of 8.
len = div(length(fake_kb07), 8) # integer division
idx = rand(rng, 1:8 , len)
show(idx)
[8, 7, 7, 6, 1, 8, 7, 3, 1, 6, 4, 6, 2, 3, 5, 8, 3, 1, 2, 7, 2, 2, 3, 2, 5, 7, …]  (1792 elements; only the first few shown)
Then we create an array A of the same length, populated with multiples of 8 (one offset per block of eight rows). Adding A and idx together gives the index of one row from each set of eight.
A = repeat([8], inner=len-1)
pushfirst!(A, 0)   # the first block needs an offset of 0
A = cumsum(A)      # offsets 0, 8, 16, …, 8*(len-1)
idx = idx .+ A
show(idx)
(output elided: a 1792-element vector of strictly increasing row indices, one drawn from each consecutive block of eight rows)
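The two-step construction above (a within-block draw plus cumulative offsets) can equivalently be written as a single comprehension. A minimal sketch, assuming the 14336-row simulated dataset; the stdlib RNG is used here so the snippet is self-contained, but the tutorial's StableRNG works the same way:

```julia
using Random

rng = MersenneTwister(42)
len = div(14336, 8)   # 1792 blocks of eight rows
# draw one row index from each consecutive block of eight
idx = [8 * (i - 1) + rand(rng, 1:8) for i in 1:len]
```

Because each draw stays inside its own block, the resulting indices are strictly increasing and cover every block exactly once.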
Reduce the balanced fully crossed design to the original experimental design:
fake_kb07_df = fake_kb07_df[idx, :]
rename!(fake_kb07_df, :dv => :rt_trunc)
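A quick sanity check: after the reduction, every subject–item combination should occur exactly once. A sketch with a toy frame (the same combine call can be applied to fake_kb07_df):

```julia
using DataFrames

# toy stand-in for a reduced design: three subjects crossed with two items
df = DataFrame(subj = ["S1", "S1", "S2", "S2", "S3", "S3"],
               item = ["I1", "I2", "I1", "I2", "I1", "I2"])
# count how often each subject–item pair occurs
counts = combine(groupby(df, [:subj, :item]), nrow => :n)
all(counts.n .== 1)   # true when no pair is duplicated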
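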
Now we can use the simulated data in the same way as above.
Set contrasts:
contrasts = Dict(:spkr => HelmertCoding(),
:prec => HelmertCoding(),
:load => HelmertCoding(),
:item => Grouping(),
:subj => Grouping());
Dict{Symbol, StatsModels.AbstractContrasts} with 5 entries:
:item => Grouping()
:spkr => HelmertCoding(nothing, nothing)
:load => HelmertCoding(nothing, nothing)
:prec => HelmertCoding(nothing, nothing)
:subj => Grouping()
Define formula, same as above:
kb07_f = @formula(rt_trunc ~ 1 + spkr + prec + load + (1|subj) + (1+prec|item));
FormulaTerm
Response:
rt_trunc(unknown)
Predictors:
1
spkr(unknown)
prec(unknown)
load(unknown)
(subj)->1 | subj
(prec,item)->(1 + prec) | item
Fit the model:
fake_kb07_m = fit(MixedModel, kb07_f, fake_kb07_df; contrasts=contrasts)
Linear mixed model fit by maximum likelihood
rt_trunc ~ 1 + spkr + prec + load + (1 | subj) + (1 + prec | item)
logLik -2 logLik AIC AICc BIC
-2524.3953 5048.7906 5066.7906 5066.8916 5116.2104
Variance components:
Column Variance Std.Dev. Corr.
item (Intercept) 0.0002647 0.0162699
prec: maintain 0.0038990 0.0624419 -1.00
subj (Intercept) 0.0086079 0.0927790
Residual 0.9683911 0.9840687
Number of obs: 1792; levels of grouping factors: 32, 56
Fixed-effects parameters:
────────────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
────────────────────────────────────────────────────────
(Intercept) 0.0509777 0.0265559 1.92 0.0549
spkr: old -0.00380141 0.0234579 -0.16 0.8713
prec: maintain -0.00331041 0.0258544 -0.13 0.8981
load: yes 0.0312056 0.0233694 1.34 0.1818
────────────────────────────────────────────────────────
Set random seed for reproducibility:
rng = StableRNG(42);
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000055)
Then, again, we specify β, σ, and θ. Here we use the values that we found in the model of the existing dataset:
#beta
new_beta = [2181.85, 67.879, -333.791, 78.5904] #manual
new_beta = kb07_m.β #grab from existing model
#sigma
new_sigma = 680.032 #manual
new_sigma = kb07_m.σ #grab from existing model
#theta
re_item_corr = [1.0 -0.7; -0.7 1.0]
re_item = create_re(0.536, 0.371; corrmat = re_item_corr)
re_subj = create_re(0.438)
new_theta = createθ(kb07_m; item=re_item, subj=re_subj)
4-element Vector{Float64}:
0.536
-0.2597
0.26494699469893973
0.438
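create_re builds the relative covariance factor for a grouping term: the lower Cholesky factor of the correlation matrix, scaled by the standard deviations (expressed relative to σ). A minimal sketch of that computation using only the standard library:

```julia
using LinearAlgebra

re_item_corr = [1.0 -0.7; -0.7 1.0]
sds = [0.536, 0.371]          # relative standard deviations
# scale the rows of the lower Cholesky factor by the standard deviations
re_item = LowerTriangular(Diagonal(sds) * cholesky(re_item_corr).L)
```

Its nonzero entries (0.536, -0.2597, 0.2649…) are exactly the item-related elements of new_theta shown above.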
Run nsims iterations:
fake_kb07_sim = parametricbootstrap(rng, nsims, fake_kb07_m,
β = new_beta,
σ = new_sigma,
θ = new_theta,
use_threads = false);
MixedModelBootstrap with 100 samples
parameter min q25 median mean q75 ⋯
┌────────────────────────────────────────────────────────────────────
1 │ β1 1999.99 2126.27 2174.94 2175.29 2223.64 ⋯
2 │ β2 32.2888 59.4361 71.9847 70.5161 83.3956 ⋯
3 │ β3 183.955 305.506 329.418 331.619 362.811 ⋯
4 │ β4 37.6806 66.8622 79.7979 79.1976 91.9039 ⋯
5 │ σ 647.511 670.822 680.682 679.62 688.46 ⋯
6 │ σ1 207.197 317.107 356.476 353.424 393.228 ⋯
7 │ σ2 161.91 214.082 238.959 243.761 268.689 ⋯
8 │ ρ1 -0.933227 -0.789665 -0.695095 -0.69707 -0.622556 ⋯
9 │ σ3 200.195 268.904 296.58 295.702 323.656 ⋯
10 │ θ1 0.310354 0.465926 0.526513 0.520171 0.570751 ⋯
11 │ θ2 -0.424107 -0.295891 -0.25059 -0.252727 -0.202564 ⋯
12 │ θ3 0.150357 0.214558 0.248308 0.247738 0.28456 ⋯
13 │ θ4 0.289254 0.395085 0.434854 0.435185 0.475661 ⋯
Power calculation
power_table(fake_kb07_sim)
4-element Vector{@NamedTuple{coefname::String, power::Float64}}:
(coefname = "(Intercept)", power = 1.0)
(coefname = "spkr: old", power = 0.99)
(coefname = "prec: maintain", power = 1.0)
(coefname = "load: yes", power = 1.0)
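Under the hood, power_table simply computes the proportion of p-values below α (0.05 by default) for each coefficient; the bootstrap object exposes the per-replicate p-values via its coefpvalues property in current MixedModels versions. A self-contained sketch of the same computation on hypothetical p-values:

```julia
using DataFrames, Statistics

# hypothetical p-values from five simulations for two coefficients
sim_p = DataFrame(coefname = repeat(["(Intercept)", "load: yes"]; inner = 5),
                  p = [0.001, 0.02, 0.04, 0.2, 0.03,
                       0.01, 0.06, 0.9, 0.001, 0.04])
# power = proportion of replicates with p < 0.05, per coefficient
ptbl = combine(groupby(sim_p, :coefname), :p => (p -> mean(p .< 0.05)) => :power)
```

For the real simulation, replace sim_p with DataFrame(fake_kb07_sim.coefpvalues).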
Compare this to the power table from the existing data:
power_table(kb07_sim)
4-element Vector{@NamedTuple{coefname::String, power::Float64}}:
(coefname = "prec: break", power = 1.0)
(coefname = "(Intercept)", power = 1.0)
(coefname = "spkr: old", power = 0.99)
(coefname = "load: yes", power = 1.0)
We have successfully recreated the power simulation of an existing dataset from scratch. This has the advantage that we can now iterate over different numbers of subjects and items.
Loop over subject and item sizes
When designing a study, you may be interested in trying various numbers of subjects and items to see how that affects the power of your study. To do this you can use a loop to run simulations over a range of values for any parameter.
We first define everything that stays fixed outside the loop (same as above):
Define factors in a dict:
subj_btwn = nothing
item_btwn = nothing
both_win = Dict(:spkr => ["old", "new"],
:prec => ["maintain", "break"],
:load => ["yes", "no"]);
Dict{Symbol, Vector{String}} with 3 entries:
:spkr => ["old", "new"]
:load => ["yes", "no"]
:prec => ["maintain", "break"]
Set contrasts:
contrasts = Dict(:spkr => HelmertCoding(),
:prec => HelmertCoding(base="maintain"),
:load => HelmertCoding());
Dict{Symbol, StatsModels.HelmertCoding} with 3 entries:
:spkr => HelmertCoding(nothing, nothing)
:load => HelmertCoding(nothing, nothing)
:prec => HelmertCoding("maintain", nothing)
Set random seed for reproducibility:
rng = StableRNG(42);
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000055)
Define formula:
kb07_f = @formula(rt_trunc ~ 1 + spkr + prec + load + (1 | subj) + (1 + prec | item));
FormulaTerm
Response:
rt_trunc(unknown)
Predictors:
1
spkr(unknown)
prec(unknown)
load(unknown)
(subj)->1 | subj
(prec,item)->(1 + prec) | item
Specify β, σ, and θ:
#beta
new_beta = [2181.85, 67.879, -333.791, 78.5904]
new_beta = kb07_m.β
#sigma
new_sigma = 680.032
new_sigma = kb07_m.σ
#theta
re_item_corr = [1.0 -0.7; -0.7 1.0]
re_item = create_re(0.536, 0.371; corrmat = re_item_corr)
re_subj = create_re(0.438)
# because the ordering in theta is dependent on the design,
# we will create it within the loop as we change the design parameters
# (e.g. n subjects and n items)
1×1 LinearAlgebra.LowerTriangular{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}}:
0.438
Then we define the variables that our loop will iterate over.
Define subject and item numbers as arrays:
sub_ns = [20, 30, 40];
item_ns = [16, 24, 32];
3-element Vector{Int64}:
16
24
32
Make an empty dataframe:
d = DataFrame();
Run the loop:
@showprogress for subj_n in sub_ns, item_n in item_ns
# Make balanced fully crossed data:
fake = simdat_crossed(subj_n, item_n;
subj_btwn = subj_btwn,
item_btwn = item_btwn,
both_win = both_win);
fake_df = DataFrame(fake)
# Reduce the balanced fully crossed design to the original experimental design:
fake_df = sort(fake_df, [:subj, :item, :load, :prec, :spkr])
local len = div(length(fake), 8)   # integer division
local idx = rand(rng, 1:8, len)
local A = repeat([8], inner=len-1)
pushfirst!(A, 0)   # the first block needs an offset of 0
A = cumsum(A)
idx = idx .+ A
fake_df = fake_df[idx, :]
rename!(fake_df, :dv => :rt_trunc)
# create the model:
fake_m = LinearMixedModel(kb07_f, fake_df, contrasts=contrasts);
local new_theta = createθ(fake_m; item=re_item, subj=re_subj)
# Run nsims iterations:
fake_sim = parametricbootstrap(rng, nsims, fake_m,
β = new_beta,
σ = new_sigma,
θ = new_theta,
use_threads = false,
progress = false);
# Power calculation
local ptbl = power_table(fake_sim)
ptdf = DataFrame(ptbl)
ptdf[!, :item_n] .= item_n
ptdf[!, :subj_n] .= subj_n
append!(d, ptdf)
end
Our dataframe d now contains the power information for each combination of subjects and items.
first(d, 10)
| Row | coefname | power | item_n | subj_n |
|-----|----------|-------|--------|--------|
| 1 | prec: break | 0.99 | 16 | 20 |
| 2 | (Intercept) | 1.0 | 16 | 20 |
| 3 | spkr: old | 0.43 | 16 | 20 |
| 4 | load: yes | 0.49 | 16 | 20 |
| 5 | prec: break | 1.0 | 24 | 20 |
| 6 | (Intercept) | 1.0 | 24 | 20 |
| 7 | spkr: old | 0.59 | 24 | 20 |
| 8 | load: yes | 0.63 | 24 | 20 |
| 9 | prec: break | 1.0 | 32 | 20 |
| 10 | (Intercept) | 1.0 | 32 | 20 |
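Before plotting, it can help to view the power for one coefficient as a subjects × items grid; unstack reshapes the long table into that form. A sketch on hypothetical power values (for the real data, first filter d to a single coefname):

```julia
using DataFrames

# hypothetical power values for one coefficient across four designs
toy = DataFrame(subj_n = [20, 20, 30, 30],
                item_n = [16, 24, 16, 24],
                power  = [0.43, 0.59, 0.60, 0.71])
# rows: item_n, columns: subj_n, cells: power
grid = unstack(toy, :item_n, :subj_n, :power)
```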
Lastly we plot our results:
function subplot_power(f::Figure, dat, coefname)
coefpow = @subset(dat, :coefname == coefname)
ax = Axis(f;
xlabel="n subjects",
ylabel="n items",
#aspect=AxisAspect(1),
title="$(coefname)")
heatmap!(ax, coefpow.subj_n, coefpow.item_n, coefpow.power; colorrange=[0,1])
return ax
end
fig = Figure(; resolution=(1300,300))
for (idx, cn) in enumerate(sort(unique(d.coefname)))
fig[1, idx] = subplot_power(fig, d, cn)
end
Colorbar(fig[1, end+1]; label="power", vertical=true, colorrange=[0,1], colormap=:viridis)
fig
Acknowledgements
The text here is based on a tutorial presented at a ZiF workshop by Lisa DeBruine (Feb. 2020) and presented again by Phillip Alday during the SMLP Summer School (Sep. 2020).
Updated and extended by Lisa Schwetlick & Daniel Backhaus, with the kind help of Phillip Alday, after changes to the package.