Contrast Coding of Visual Attention Effects

Author

Phillip Alday, Douglas Bates, and Reinhold Kliegl

Published

2022-09-27

Code
using Arrow
using Chain
using DataFrames
using MixedModels
using ProgressMeter
using StatsBase
using StatsModels
using StatsModels: ContrastsCoding

ProgressMeter.ijulia_behavior(:clear);

A word of caution

For a (quasi-)experimental set of data, there is (or should be) a clear a priori theoretical committment to specific hypotheses about differences between factor levels and how these differences enter in interactions with other factors. This specification should be used in the first LMM and reported, irrespective of the outcome. If alternative theories lead to alternative a priori contrast specifications, both analyses are justified. If the observed means render the specification completely irrelevant, the comparisons originally planned could still be reported in a Supplement).

In this script, we are working through a large number of different contrasts for the same data. The purpose is to introduce both the preprogrammed (“canned”) and the general options to specify hypotheses about main effects and interactions. Obviously, we do not endorse generating a plot of the means and specifying the contrasts accordingly. This is known as the Texas sharpshooter fallacy. The link leads to an illustration and brief historical account by Wagenmakers (2018).

Irrespective of how results turn out, there is nothing wrong with specifying a set of post-hoc contrasts to gain a better understanding of what the data are trying to tell us. Of course, in an article or report about the study, the a priori and post-hoc nature of contrast specifications must be made clear. Some kind of alpha-level adjustment (e.g., Bonferroni) may be called for, too. And, of course, there are grey zones.

There is quite a bit of statistical literature on contrasts. Two “local” references are:

Brehm, L., & Alday, P. M., (2022). Contrast coding choices in a decade of mixed models. Journal of Memory and Language, 125, 104334.

Schad, D. J., Vasishth, S., Hohenstein, S., & Kliegl, R. (2020). How to capitalize on a priori contrasts in linear (mixed) models: A tutorial. Journal of Memory and Language, 110, 104038.

For further readings see “Further Readings” in Schad et al. (2020).

Example data

We take the KWDYZ dataset (Kliegl et al., 2010). This is an experiment looking at three effects of visual cueing under four different cue-target relations (CTRs). Two horizontal rectangles are displayed above and below a central fixation point or they displayed in vertical orientation to the left and right of the fixation point. Subjects react to the onset of a small visual target occuring at one of the four ends of the two rectangles. The target is cued validly on 70% of trials by a brief flash of the corner of the rectangle at which it appears; it is cued invalidly at the three other locations 10% of the trials each.

We specify three contrasts for the four-level factor CTR that are derived from spatial, object-based, and attractor-like features of attention. They map onto sequential differences between appropriately ordered factor levels.

We also have a dataset from a replication and extension of this study (Kliegl, Kuschela, & Laubrock, 2015). Both data sets are available in R-package RePsychLing

Preprocessing

dat1 = @chain "./data/kwdyz11.arrow" begin
  Arrow.Table
  DataFrame
  select!(:subj => :Subj, :tar => :CTR, :rt)
end
cellmeans = combine(
  groupby(dat1, [:CTR]),
  :rt => mean,
  :rt => std,
  :rt => length,
  :rt => (x -> std(x) / sqrt(length(x))) => :rt_semean,
)

4 rows × 5 columns

CTRrt_meanrt_stdrt_lengthrt_semean
StringFloat64Float64Int64Float64
1val358.03283.4581201410.588069
2sod391.26792.66228631.73177
3dos405.14692.689328431.73837
4dod402.395.391428631.78278

Julia contrast options

We use the same formula for all analyses

form = @formula rt ~ 1 + CTR + (1 + CTR | Subj)
FormulaTerm
Response:
  rt(unknown)
Predictors:
  1
  CTR(unknown)
  (CTR,Subj)->(1 + CTR) | Subj

This is the default order of factor levels.

StatsModels.levels(dat1.CTR)
4-element Vector{String}:
 "val"
 "sod"
 "dos"
 "dod"

Controlling the ordering of levels for contrasts:

  1. kwarg levels to order the levels
  2. The first level is set as the baseline; with kwarg base a different level can be specified.

SeqDiffCoding

The SeqDiffCoding contrast corresponds to MASS::contr.sdif() in R. The assignment of random factors such as Subj to Grouping() is necessary when the sample size is very large. We recommend to include it always, but in this tutorial we do so only in the first example.

form = @formula rt ~ 1 + CTR + (1 + CTR | Subj)
levels = ["val", "sod", "dos", "dod"]
m1 = let
  contrasts = Dict(
    :CTR => SeqDiffCoding(; levels), 
    :Subj => Grouping()
  )
  fit(MixedModel, form, dat1; contrasts)
end
Minimizing 202   Time: 0:00:00 ( 2.07 ms/it)
  objective:  325809.5493487461
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0914 54.96 <1e-99 55.2001
CTR: sod 33.7817 3.2874 10.28 <1e-24 23.2485
CTR: dos 13.9852 2.3056 6.07 <1e-08 10.7520
CTR: dod -2.7470 2.2142 -1.24 0.2147 9.5093
Residual 69.8348

HypothesisCoding

HypothesisCoding is the most general option available. We can implement all “canned” contrasts ourselves. The next example reproduces the test statistcs from SeqDiffCoding - with a minor modification illustrating the flexibility of going beyond the default version.

m1b = let
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1  1 0  0
         0 -1 1  0
         0  0 1 -1
      ];
      levels,
      labels=["spt", "obj", "grv"],
    ),
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0910 54.96 <1e-99 55.1977
CTR: spt 33.7817 3.2876 10.28 <1e-24 23.2499
CTR: obj 13.9852 2.3058 6.07 <1e-08 10.7556
CTR: grv 2.7470 2.2144 1.24 0.2148 9.5116
Residual 69.8348

The difference to the preprogrammed SeqDiffCoding is that for the third contrast we changed the direction of the contrast such that the sign of the effect is positive when the result is in agreement with theoretical expectation, that is we subtract the fourth level from the third, not the third level from the fourth.

DummyCoding

Thi contrast corresponds to contr.treatment() in R

m2 = let
  contrasts = Dict(:CTR => DummyCoding(; base="val"))
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 358.0914 6.1516 58.21 <1e-99 47.8919
CTR: dod 45.0200 4.3644 10.32 <1e-24 32.2974
CTR: dos 47.7669 3.5573 13.43 <1e-40 25.5426
CTR: sod 33.7818 3.2886 10.27 <1e-24 23.2586
Residual 69.8348

The DummyCoding contrast has the disadvantage that the intercept returns the mean of the level specified as base, default is the first level, not the GM.

YchycaeitCoding

The contrasts returned by DummyCoding may be exactly what we want. Can’t we have them, but also have the intercept estimate the GM, rather than the mean of the base level? Yes, we can! We call this “You can have your cake and it eat, too”-Coding (YchycaeitCoding). And we use HpothesisCoding to achieve this outcome.

m2b = let
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1 1 0 0
        -1 0 1 0
        -1 0 0 1
      ];
      levels,
    )
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7335 7.0691 55.13 <1e-99 55.0256
CTR: sod 33.7816 3.2802 10.30 <1e-24 23.1865
CTR: dos 47.7669 3.5513 13.45 <1e-40 25.4919
CTR: dod 45.0199 4.3542 10.34 <1e-24 32.2138
Residual 69.8355

We can simply relevel the factor or move the column with -1s for a different base.

EffectsCoding

This contrast corresponds almost to contr.sum() in R.

m3 = let
  contrasts = Dict(:CTR => EffectsCoding(; base="dod"))
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0908 54.96 <1e-99 55.1953
CTR: dos 16.1248 1.4403 11.20 <1e-28 7.3308
CTR: sod 2.1396 1.3338 1.60 0.1087 6.0080
CTR: val -31.6422 2.6424 -11.97 <1e-32 19.9512
Residual 69.8348

The “almost” qualification refers to the fact that contr.sum() uses the last factor levels as defaul base level; EffectsCoding uses the first level.

HelmertCoding

HelmertCoding codes each level as the difference from the average of the lower levels. With the default order of CTR levels we get the following test statistics. These contrasts are othogonal.

m4 = let
  contrasts = Dict(:CTR => HelmertCoding())
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0931 54.95 <1e-99 55.2134
CTR: dos 1.3735 1.1071 1.24 0.2147 4.7544
CTR: sod -4.2039 0.6843 -6.14 <1e-09 3.3491
CTR: val -10.5474 0.8809 -11.97 <1e-32 6.6508
Residual 69.8348
+ HeC1: (2 - 1)/2           # (391 - 358)/2
+ HeC2: (3 - (2+1)/2)/3     # (405 - (391 + 358)/2)/3
+ HeC3: (4 - (3+2+1)/3)/4   # (402 - (405 + 391 + 358)/3)/4

Reverse HelmertCoding

Reverse HelmertCoding codes each level as the difference from the average of the higher levels. To estimate these effects we simply reverse the order of factor levels. Of course, the contrasts are also orthogonal.

m4b = let
  levels = reverse(StatsModels.levels(dat1.CTR))
  contrasts = Dict(:CTR => HelmertCoding(; levels))
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0931 54.95 <1e-99 55.2134
CTR: dos 1.3735 1.1071 1.24 0.2147 4.7544
CTR: sod -4.2039 0.6843 -6.14 <1e-09 3.3491
CTR: val -10.5474 0.8809 -11.97 <1e-32 6.6508
Residual 69.8348
+ HeC1:(3 - 4)/2            # (405 - 402)/2
+ HeC2:(2 - (3+4)/2)/3      # (391 - (405 + 402)/2)/3
+ HeC3:(1 - (2+3+4)/3/4     # (356  -(391 + 405 + 402)/3)/4 

Anova Coding

Factorial designs (i.e., lab experiments) are traditionally analyzed with analysis of variance. The test statistics of main effects and interactions are based on an orthogonal set of contrasts. We specify them with HypothesisCoding.

A(2) x B(2)

An A(2) x B(2) design can be recast as an F(4) design with the levels (A1-B1, A1-B2, A2-B1, A2-B2). The following contrast specifiction returns estimates for the main effect of A, the main effect of B, and the interaction of A and B. In a figure With A on the x-axis and the levels of B shown as two lines, the interaction tests the null hypothesis that the two lines are parallel. A positive coefficient implies overadditivity (diverging lines toward the right) and a negative coefficient underadditivity (converging lines).

m5 = let
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1 -1 +1 +1          # A
        -1 +1 -1 +1          # B
        +1 -1 -1 +1          # A x B
      ];
      levels,
      labels=["A", "B", "AxB"],
    ),
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0912 54.96 <1e-99 55.1986
CTR: A 59.0052 5.1827 11.38 <1e-29 36.2091
CTR: B 31.0348 4.6750 6.64 <1e-10 31.7133
CTR: AxB -36.5287 3.0927 -11.81 <1e-31 16.0046
Residual 69.8348

It is also helpful to see the corresponding layout of the four means for the interaction of A and B (i.e., the third contrast)

        B1     B2
   A1   +1     -1
   A2   -1     +1

Thus, interaction tests whether the difference between main diagonal and minor diagonal is different from zero.

A(2) x B(2) x C(2)

Going beyond the four level factor; it is also helpful to see the corresponding layout of the eight means for the interaction of A and B and C.

          C1              C2
      B1     B2        B1     B2
 A1   +1     -1   A1   -1     +1
 A2   -1     +1   A2   +1     -1

A(2) x B(2) x C(3)

TO BE DONE

Nested coding

Nested contrasts are often specified as follow up as post-hoc tests for ANOVA interactions. They are orthogonal. We specify them with HypothesisCoding.

An A(2) x B(2) design can be recast as an F(4) design with the levels (A1-B1, A1-B2, A2-B1, A2-B2). The following contrast specifiction returns an estimate for the main effect of A and the effects of B nested in the two levels of A. In a figure With A on the x-axis and the levels of B shown as two lines, the second contrast tests whether A1-B1 is different from A1-B2 and the third contrast tests whether A2-B1 is different from A2-B2.

m8 = let
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1 -1 +1 +1
        -1 +1  0  0
         0  0 +1 -1
      ];
      levels,
      labels=["do_so", "spt", "grv"],
    ),
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0908 54.96 <1e-99 55.1961
CTR: do_so 59.0051 5.1824 11.39 <1e-29 36.2060
CTR: spt 33.7817 3.2871 10.28 <1e-24 23.2462
CTR: grv 2.7470 2.2140 1.24 0.2147 9.5068
Residual 69.8349

The three contrasts for one main effect and two nested contrasts are orthogonal. There is no test of the interaction (parallelism).

Other orthogonal contrasts

For factors with more than four levels there are many options for specifying orthogonal contrasts as long as one proceeds in a top-down strictly hiearchical fashion.

Suppose you have a factor with seven levels and let’s ignore shifting colummns. In this case, you have six options for the first contrast, that is 6 vs. 1, 5 vs.2 , 4 vs. 3, 3 vs. 4, 2 vs. 5, and 1 vs. 6 levels. Then, you specify orthogonal contrasts for partitions with more than 2 elements and so on. That is, you don’t specify a contrast that crosses an earlier partition line.

In the following example, after an initial 4 vs 3 partitioning of levels, we specify AnovaCoding for the left and HelmertCoding for the right partition.

contrasts = Dict(
  :CTR => HypothesisCoding(
    [
      -1/4 -1/4 -1/4 -1/4 +1/3 +1/3 +1/3
      -1/2 -1/2 +1/2 +1/2    0    0    0
      -1/2 +1/2 -1/2 +1/2    0    0    0
      +1/2 -1/2 -1/2 +1/2    0    0    0
         0    0    0    0   -1   +1    0
         0    0    0    0 -1/2 -1/2    1
    ];
    levels=["A1", "A2", "A3", "A4", "A5", "A6", "A7"],
    labels=["c567.1234", "B", "C", "BxC", "c6.5", "c6.56"],
  ),
);

There are two rules that hold for all orthogonal contrasts:

  1. The weights within rows sum to zero.
  2. For all pairs of rows, the sum of the products of weights in the same columns sums to zero.

Appendix: Summary (Dave Kleinschmidt)

StatsModels

StatsModels.jl provides a few commonly used contrast coding schemes, some less-commonly used schemes, and structs that allow you to manually specify your own, custom schemes.

Standard contrasts

The most commonly used contrasts are DummyCoding and EffectsCoding (which are similar to contr.treatment() and contr.sum() in R, respectively).

“Exotic” contrasts (rk: well …)

We also provide HelmertCoding and SeqDiffCoding (corresponding to base R’s contr.helmert() and MASS::contr.sdif()).

Manual contrasts

ContrastsCoding()

There are two ways to manually specify contrasts. First, you can specify them directly via ContrastsCoding. If you do, it’s good practice to specify the levels corresponding to the rows of the matrix, although they can be omitted in which case they’ll be inferred from the data.

HypothesisCoding()

A better way to specify manual contrasts is via HypothesisCoding, where each row of the matrix corresponds to the weights given to the cell means of the levels corresponding to each column (see Schad et al. (2020) for more information).

References

Kliegl, R., Wei, P., Dambacher, M., Yan, M., & Zhou, X. (2010). Experimental effects and individual differences in linear mixed models: Estimating the relationship between spatial, object, and attraction effects in visual attention. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2010.00238
Schad, D. J., Vasishth, S., Hohenstein, S., & Kliegl, R. (2020). How to capitalize on a priori contrasts in linear (mixed) models: A tutorial. Journal of Memory and Language, 110, 104038. https://doi.org/10.1016/j.jml.2019.104038