using Arrow
using Chain
using DataFrames
using MixedModels
using ProgressMeter
using StatsBase
using StatsModels
using StatsModels: ContrastsCoding
ProgressMeter.ijulia_behavior(:clear);
Phillip Alday, Douglas Bates, and Reinhold Kliegl
2022-09-27
For a (quasi-)experimental set of data, there is (or should be) a clear a priori theoretical commitment to specific hypotheses about differences between factor levels and how these differences enter into interactions with other factors. This specification should be used in the first LMM and reported, irrespective of the outcome. If alternative theories lead to alternative a priori contrast specifications, both analyses are justified. If the observed means render the specification completely irrelevant, the comparisons originally planned could still be reported in a supplement.
In this script, we are working through a large number of different contrasts for the same data. The purpose is to introduce both the preprogrammed (“canned”) and the general options to specify hypotheses about main effects and interactions. Obviously, we do not endorse generating a plot of the means and specifying the contrasts accordingly. This is known as the Texas sharpshooter fallacy. The link leads to an illustration and brief historical account by Wagenmakers (2018).
Irrespective of how results turn out, there is nothing wrong with specifying a set of post-hoc contrasts to gain a better understanding of what the data are trying to tell us. Of course, in an article or report about the study, the a priori and post-hoc nature of contrast specifications must be made clear. Some kind of alpha-level adjustment (e.g., Bonferroni) may be called for, too. And, of course, there are grey zones.
There is quite a bit of statistical literature on contrasts. Two “local” references are:
Brehm, L., & Alday, P. M. (2022). Contrast coding choices in a decade of mixed models. Journal of Memory and Language, 125, 104334.
Schad, D. J., Vasishth, S., Hohenstein, S., & Kliegl, R. (2020). How to capitalize on a priori contrasts in linear (mixed) models: A tutorial. Journal of Memory and Language, 110, 104038.
For further reading, see “Further Readings” in Schad et al. (2020).
We take the KWDYZ dataset (Kliegl et al., 2010). This is an experiment looking at three effects of visual cueing under four different cue-target relations (CTRs). Two horizontal rectangles are displayed above and below a central fixation point, or they are displayed in vertical orientation to the left and right of the fixation point. Subjects react to the onset of a small visual target occurring at one of the four ends of the two rectangles. The target is cued validly on 70% of trials by a brief flash of the corner of the rectangle at which it appears; it is cued invalidly at the three other locations on 10% of the trials each.
We specify three contrasts for the four-level factor CTR that are derived from spatial, object-based, and attractor-like features of attention. They map onto sequential differences between appropriately ordered factor levels.
We also have a dataset from a replication and extension of this study (Kliegl, Kuschela, & Laubrock, 2015). Both data sets are available in the R package RePsychLing.
dat1 = @chain "./data/kwdyz11.arrow" begin
Arrow.Table
DataFrame
select!(:subj => :Subj, :tar => :CTR, :rt)
end
cellmeans = combine(
groupby(dat1, [:CTR]),
:rt => mean,
:rt => std,
:rt => length,
:rt => (x -> std(x) / sqrt(length(x))) => :rt_semean,
)
4 rows × 5 columns

| | CTR | rt_mean | rt_std | rt_length | rt_semean |
|---|---|---|---|---|---|
| | String | Float64 | Float64 | Int64 | Float64 |
| 1 | val | 358.032 | 83.4581 | 20141 | 0.588069 |
| 2 | sod | 391.267 | 92.662 | 2863 | 1.73177 |
| 3 | dos | 405.146 | 92.6893 | 2843 | 1.73837 |
| 4 | dod | 402.3 | 95.3914 | 2863 | 1.78278 |
We use the same formula for all analyses:
FormulaTerm
Response:
  rt(unknown)
Predictors:
  1
  CTR(unknown)
  (CTR,Subj)->(1 + CTR) | Subj
This is the default order of factor levels.

Controlling the ordering of levels for contrasts:

- kwarg `levels` to order the levels
- kwarg `base` to specify a different level as the base

The `SeqDiffCoding` contrast corresponds to `MASS::contr.sdif()` in R. The assignment of random factors such as `Subj` to `Grouping()` is necessary when the sample size is very large. We recommend always including it, but in this tutorial we do so only in the first example.
form = @formula rt ~ 1 + CTR + (1 + CTR | Subj)
levels = ["val", "sod", "dos", "dod"]
m1 = let
contrasts = Dict(
:CTR => SeqDiffCoding(; levels),
:Subj => Grouping()
)
fit(MixedModel, form, dat1; contrasts)
end
objective: 325809.5493487461
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0914 | 54.96 | <1e-99 | 55.2001 |
| CTR: sod | 33.7817 | 3.2874 | 10.28 | <1e-24 | 23.2485 |
| CTR: dos | 13.9852 | 2.3056 | 6.07 | <1e-08 | 10.7520 |
| CTR: dod | -2.7470 | 2.2142 | -1.24 | 0.2147 | 9.5093 |
| Residual | | | | | 69.8348 |
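As a quick plausibility check (our addition, not part of the original analysis), these sequential-difference estimates should be close to the sequential differences of the observed cell means computed above; they do not match exactly because the data are unbalanced and the estimates come from the mixed model. A minimal sketch, assuming the rows of `cellmeans` come out in the order val, sod, dos, dod shown above:

# Sequential differences of the observed cell means (val → sod → dos → dod)
diff(cellmeans.rt_mean)  # ≈ [33.2, 13.9, -2.8], close to the CTR estimates above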
`HypothesisCoding` is the most general option available. We can implement all “canned” contrasts ourselves. The next example reproduces the test statistics from `SeqDiffCoding`, with a minor modification illustrating the flexibility of going beyond the default version.
m1b = let
contrasts = Dict(
:CTR => HypothesisCoding(
[
-1 1 0 0
0 -1 1 0
0 0 1 -1
];
levels,
labels=["spt", "obj", "grv"],
),
)
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0910 | 54.96 | <1e-99 | 55.1977 |
| CTR: spt | 33.7817 | 3.2876 | 10.28 | <1e-24 | 23.2499 |
| CTR: obj | 13.9852 | 2.3058 | 6.07 | <1e-08 | 10.7556 |
| CTR: grv | 2.7470 | 2.2144 | 1.24 | 0.2148 | 9.5116 |
| Residual | | | | | 69.8348 |
The difference from the preprogrammed `SeqDiffCoding` is that we changed the direction of the third contrast so that the sign of the effect is positive when the result agrees with theoretical expectation; that is, we subtract the fourth level from the third, not the third level from the fourth.
This contrast corresponds to `contr.treatment()` in R.
m2 = let
contrasts = Dict(:CTR => DummyCoding(; base="val"))
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 358.0914 | 6.1516 | 58.21 | <1e-99 | 47.8919 |
| CTR: dod | 45.0200 | 4.3644 | 10.32 | <1e-24 | 32.2974 |
| CTR: dos | 47.7669 | 3.5573 | 13.43 | <1e-40 | 25.5426 |
| CTR: sod | 33.7818 | 3.2886 | 10.27 | <1e-24 | 23.2586 |
| Residual | | | | | 69.8348 |
The `DummyCoding` contrast has the disadvantage that the intercept returns the mean of the level specified as `base` (by default the first level), not the grand mean (GM).

The contrasts returned by `DummyCoding` may be exactly what we want. Can’t we have them, but also have the intercept estimate the GM rather than the mean of the base level? Yes, we can! We call this “You can have your cake and eat it, too” coding (YchycaeitCoding), and we use `HypothesisCoding` to achieve this outcome.
m2b = let
contrasts = Dict(
:CTR => HypothesisCoding(
[
-1 1 0 0
-1 0 1 0
-1 0 0 1
];
levels,
)
)
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7335 | 7.0691 | 55.13 | <1e-99 | 55.0256 |
| CTR: sod | 33.7816 | 3.2802 | 10.30 | <1e-24 | 23.1865 |
| CTR: dos | 47.7669 | 3.5513 | 13.45 | <1e-40 | 25.4919 |
| CTR: dod | 45.0199 | 4.3542 | 10.34 | <1e-24 | 32.2138 |
| Residual | | | | | 69.8355 |
For a different base, we can simply relevel the factor or move the column of -1s; see the sketch below.
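For illustration, here is a minimal sketch (the hypothesis matrix and the model name `m2c` are ours, not from the original script) in which the column of -1s is moved to the last position, making “dod” the base while the intercept still estimates the GM:

m2c = let
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        1 0 0 -1    # val - dod
        0 1 0 -1    # sod - dod
        0 0 1 -1    # dos - dod
      ];
      levels,
    )
  )
  fit(MixedModel, form, dat1; contrasts)
end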
This contrast corresponds almost to `contr.sum()` in R.
m3 = let
contrasts = Dict(:CTR => EffectsCoding(; base="dod"))
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0908 | 54.96 | <1e-99 | 55.1953 |
| CTR: dos | 16.1248 | 1.4403 | 11.20 | <1e-28 | 7.3308 |
| CTR: sod | 2.1396 | 1.3338 | 1.60 | 0.1087 | 6.0080 |
| CTR: val | -31.6422 | 2.6424 | -11.97 | <1e-32 | 19.9512 |
| Residual | | | | | 69.8348 |
The “almost” qualification refers to the fact that `contr.sum()` uses the last factor level as the default base level, whereas `EffectsCoding` uses the first level.
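For comparison, a minimal sketch of the default base behavior (the model name `m3b` is ours, not from the original script): with the explicit level order defined above, `EffectsCoding` drops the first level, “val”, as the base.

m3b = let
  contrasts = Dict(:CTR => EffectsCoding(; levels))  # base defaults to the first level, here "val"
  fit(MixedModel, form, dat1; contrasts)
end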
`HelmertCoding` codes each level as the difference from the average of the lower levels. With the default order of `CTR` levels we get the following test statistics. These contrasts are orthogonal.
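The code for this fit is not shown in the text; a minimal sketch of what it presumably looks like (the model name `m4` is our addition, following the numbering of the other models; without an explicit `levels` argument the default, sorted order dod, dos, sod, val appears to be used, consistent with the coefficient names in the table below):

m4 = let
  contrasts = Dict(:CTR => HelmertCoding())
  fit(MixedModel, form, dat1; contrasts)
end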
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0931 | 54.95 | <1e-99 | 55.2134 |
| CTR: dos | 1.3735 | 1.1071 | 1.24 | 0.2147 | 4.7544 |
| CTR: sod | -4.2039 | 0.6843 | -6.14 | <1e-09 | 3.3491 |
| CTR: val | -10.5474 | 0.8809 | -11.97 | <1e-32 | 6.6508 |
| Residual | | | | | 69.8348 |
Reverse `HelmertCoding` codes each level as the difference from the average of the higher levels. To estimate these effects we simply reverse the order of factor levels. Of course, the contrasts are also orthogonal.
m4b = let
levels = reverse(StatsModels.levels(dat1.CTR))
contrasts = Dict(:CTR => HelmertCoding(; levels))
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0931 | 54.95 | <1e-99 | 55.2134 |
| CTR: dos | 1.3735 | 1.1071 | 1.24 | 0.2147 | 4.7544 |
| CTR: sod | -4.2039 | 0.6843 | -6.14 | <1e-09 | 3.3491 |
| CTR: val | -10.5474 | 0.8809 | -11.97 | <1e-32 | 6.6508 |
| Residual | | | | | 69.8348 |
Factorial designs (i.e., lab experiments) are traditionally analyzed with analysis of variance (ANOVA). The test statistics of main effects and interactions are based on an orthogonal set of contrasts. We specify them with `HypothesisCoding`.

An A(2) x B(2) design can be recast as an F(4) design with the levels (A1-B1, A1-B2, A2-B1, A2-B2). The following contrast specification returns estimates for the main effect of A, the main effect of B, and the interaction of A and B. In a figure with A on the x-axis and the levels of B shown as two lines, the interaction tests the null hypothesis that the two lines are parallel. A positive coefficient implies overadditivity (diverging lines toward the right) and a negative coefficient underadditivity (converging lines).
m5 = let
contrasts = Dict(
:CTR => HypothesisCoding(
[
-1 -1 +1 +1 # A
-1 +1 -1 +1 # B
+1 -1 -1 +1 # A x B
];
levels,
labels=["A", "B", "AxB"],
),
)
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0912 | 54.96 | <1e-99 | 55.1986 |
| CTR: A | 59.0052 | 5.1827 | 11.38 | <1e-29 | 36.2091 |
| CTR: B | 31.0348 | 4.6750 | 6.64 | <1e-10 | 31.7133 |
| CTR: AxB | -36.5287 | 3.0927 | -11.81 | <1e-31 | 16.0046 |
| Residual | | | | | 69.8348 |
It is also helpful to see the corresponding layout of the four means for the interaction of A and B (i.e., the third contrast):

      B1   B2
A1    +1   -1
A2    -1   +1
Thus, the interaction tests whether the difference between the main diagonal and the minor diagonal is different from zero.
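As a rough check (our addition), the same pattern can be read off the observed cell means; the correspondence with the model estimate is only approximate because the data are unbalanced.

# (+1)·val + (-1)·sod + (-1)·dos + (+1)·dod, using the cell means computed above
cm = Dict(zip(cellmeans.CTR, cellmeans.rt_mean))
cm["val"] - cm["sod"] - cm["dos"] + cm["dod"]  # ≈ -36.1, cf. the CTR: AxB estimate of -36.5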
Going beyond the four-level factor, it is also helpful to see the corresponding layout of the eight means for the interaction of A, B, and C.

          C1                  C2
      B1   B2             B1   B2
A1    +1   -1       A1    -1   +1
A2    -1   +1       A2    +1   -1
TO BE DONE
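Purely as a sketch of where this section is heading (the factor name `:F`, the level names, and the labels are hypothetical, not part of the KWDYZ data), the seven orthogonal ANOVA contrasts for an A(2) x B(2) x C(2) design recast as an eight-level factor could be specified as follows; the last row encodes the three-way interaction with the sign convention of the layout shown above.

contrasts_2x2x2 = Dict(
  :F => HypothesisCoding(
    [
      -1 -1 +1 +1 -1 -1 +1 +1   # A
      -1 +1 -1 +1 -1 +1 -1 +1   # B
      -1 -1 -1 -1 +1 +1 +1 +1   # C
      +1 -1 -1 +1 +1 -1 -1 +1   # A x B
      +1 +1 -1 -1 -1 -1 +1 +1   # A x C
      +1 -1 +1 -1 -1 +1 -1 +1   # B x C
      +1 -1 -1 +1 -1 +1 +1 -1   # A x B x C (layout shown above)
    ];
    levels=["A1-B1-C1", "A1-B2-C1", "A2-B1-C1", "A2-B2-C1",
            "A1-B1-C2", "A1-B2-C2", "A2-B1-C2", "A2-B2-C2"],
    labels=["A", "B", "C", "AxB", "AxC", "BxC", "AxBxC"],
  ),
);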
Nested contrasts are often specified as follow-up post-hoc tests for ANOVA interactions. They are orthogonal. We specify them with `HypothesisCoding`.

An A(2) x B(2) design can be recast as an F(4) design with the levels (A1-B1, A1-B2, A2-B1, A2-B2). The following contrast specification returns an estimate for the main effect of A and the effects of B nested within the two levels of A. In a figure with A on the x-axis and the levels of B shown as two lines, the second contrast tests whether A1-B1 is different from A1-B2 and the third contrast tests whether A2-B1 is different from A2-B2.
m8 = let
contrasts = Dict(
:CTR => HypothesisCoding(
[
-1 -1 +1 +1
-1 +1 0 0
0 0 +1 -1
];
levels,
labels=["do_so", "spt", "grv"],
),
)
fit(MixedModel, form, dat1; contrasts)
end
| | Est. | SE | z | p | σ_Subj |
|---|---|---|---|---|---|
| (Intercept) | 389.7336 | 7.0908 | 54.96 | <1e-99 | 55.1961 |
| CTR: do_so | 59.0051 | 5.1824 | 11.39 | <1e-29 | 36.2060 |
| CTR: spt | 33.7817 | 3.2871 | 10.28 | <1e-24 | 23.2462 |
| CTR: grv | 2.7470 | 2.2140 | 1.24 | 0.2147 | 9.5068 |
| Residual | | | | | 69.8349 |
The three contrasts for one main effect and two nested contrasts are orthogonal. There is no test of the interaction (parallelism).
For factors with more than four levels there are many options for specifying orthogonal contrasts, as long as one proceeds in a top-down, strictly hierarchical fashion.

Suppose you have a factor with seven levels and let’s ignore shifting columns. In this case, you have six options for the first contrast, that is, 6 vs. 1, 5 vs. 2, 4 vs. 3, 3 vs. 4, 2 vs. 5, and 1 vs. 6 levels. Then you specify orthogonal contrasts for partitions with more than two elements, and so on. That is, you don’t specify a contrast that crosses an earlier partition line.
In the following example, after an initial 4 vs. 3 partitioning of levels, we specify `AnovaCoding` for the left partition and `HelmertCoding` for the right partition.
contrasts = Dict(
:CTR => HypothesisCoding(
[
-1/4 -1/4 -1/4 -1/4 +1/3 +1/3 +1/3
-1/2 -1/2 +1/2 +1/2 0 0 0
-1/2 +1/2 -1/2 +1/2 0 0 0
+1/2 -1/2 -1/2 +1/2 0 0 0
0 0 0 0 -1 +1 0
0 0 0 0 -1/2 -1/2 1
];
levels=["A1", "A2", "A3", "A4", "A5", "A6", "A7"],
labels=["c567.1234", "B", "C", "BxC", "c6.5", "c6.56"],
),
);
There are two rules that hold for all orthogonal contrasts (a quick numerical check is sketched below):

- the weights within each row (i.e., within each contrast) sum to zero;
- for any pair of rows, the sum of the products of the weights in the same columns (i.e., the dot product of the two rows) is zero.
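A quick numerical check of these two rules for the 6 x 7 hypothesis matrix above (our sketch; `hyp` simply restates that matrix):

hyp = [
  -1/4 -1/4 -1/4 -1/4 +1/3 +1/3 +1/3
  -1/2 -1/2 +1/2 +1/2    0    0    0
  -1/2 +1/2 -1/2 +1/2    0    0    0
  +1/2 -1/2 -1/2 +1/2    0    0    0
     0    0    0    0   -1   +1    0
     0    0    0    0 -1/2 -1/2    1
]
sum(hyp; dims=2)  # rule 1: every row sums to zero (up to floating-point error)
hyp * hyp'        # rule 2: all off-diagonal entries are zero (up to floating-point error)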
StatsModels.jl provides a few commonly used contrast coding schemes, some less-commonly used schemes, and structs that allow you to manually specify your own, custom schemes.
The most commonly used contrasts are `DummyCoding` and `EffectsCoding` (which are similar to `contr.treatment()` and `contr.sum()` in R, respectively). We also provide `HelmertCoding` and `SeqDiffCoding` (corresponding to base R’s `contr.helmert()` and `MASS::contr.sdif()`).
`ContrastsCoding()`

There are two ways to manually specify contrasts. First, you can specify them directly via `ContrastsCoding`. If you do, it’s good practice to specify the levels corresponding to the rows of the matrix, although they can be omitted, in which case they’ll be inferred from the data.
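As a minimal sketch (our example, not from the original text; the model name `m2man` is ours, and the exact `ContrastsCoding` constructor signature should be checked in the StatsModels docstring; here we pass only the matrix), the treatment coding used in `m2` can be written out by hand. For `ContrastsCoding` the matrix is k x (k-1), with rows corresponding to levels and columns to contrasts; relying on the inferred, sorted level order dod, dos, sod, val and making “val” the implicit base should reproduce the estimates of `m2`.

m2man = let
  mat = [
    1 0 0   # dod
    0 1 0   # dos
    0 0 1   # sod
    0 0 0   # val (base: row of zeros)
  ]
  contrasts = Dict(:CTR => ContrastsCoding(mat))
  fit(MixedModel, form, dat1; contrasts)
end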
`HypothesisCoding()`

A better way to specify manual contrasts is via `HypothesisCoding`, where each row of the matrix corresponds to the weights given to the cell means of the levels corresponding to each column (see Schad et al., 2020, for more information).