Contrast Coding of Visual Attention Effects

Author

Phillip Alday, Douglas Bates, and Reinhold Kliegl

Published

2024-06-27

Code
using Chain
using DataFrames
using MixedModels
using SMLP2023: dataset
using StatsBase
using StatsModels

import ProgressMeter
ProgressMeter.ijulia_behavior(:clear);

1 A word of caution

For a (quasi-)experimental set of data, there is (or should be) a clear a priori theoretical commitment to specific hypotheses about differences between factor levels and how these differences enter in interactions with other factors. This specification should be used in the first LMM and reported, irrespective of the outcome. If alternative theories lead to alternative a priori contrast specifications, both analyses are justified. If the observed means render the specification completely irrelevant, the comparisons originally planned could still be reported in a Supplement).

In this script, we are working through a large number of different contrasts for the same data. The purpose is to introduce both the preprogrammed (“canned”) and the general options to specify hypotheses about main effects and interactions. Obviously, we do not endorse generating a plot of the means and specifying the contrasts accordingly. This is known as the Texas sharpshooter fallacy. The link leads to an illustration and brief historical account by Wagenmakers (2018).

Irrespective of how results turn out, there is nothing wrong with specifying a set of post-hoc contrasts to gain a better understanding of what the data are trying to tell us. Of course, in an article or report about the study, the a priori and post-hoc nature of contrast specifications must be made clear. Some kind of alpha-level adjustment (e.g., Bonferroni) may be called for, too. And, of course, there are grey zones.

There is quite a bit of statistical literature on contrasts. Two “local” references are Brehm & Alday (2022) and Schad et al. (2020).

For further readings see “Further Readings” in Schad et al. (2020).

2 Example data

We take the KWDYZ dataset from Kliegl et al. (2011). This is an experiment looking at three effects of visual cueing under four different cue-target relations (CTRs). Two horizontal rectangles are displayed above and below a central fixation point or they displayed in vertical orientation to the left and right of the fixation point. Subjects react to the onset of a small visual target occurring at one of the four ends of the two rectangles. The target is cued validly on 70% of trials by a brief flash of the corner of the rectangle at which it appears; it is cued invalidly at the three other locations 10% of the trials each.

We specify three contrasts for the four-level factor CTR that are derived from spatial, object-based, and attractor-like features of attention. They map onto sequential differences between appropriately ordered factor levels.

We also have a dataset from a replication and extension of this study Kliegl et al. (2015) Both data sets are available in R-package RePsychLing

3 Preprocessing

dat1 = DataFrame(dataset(:kwdyz11))
cellmeans = combine(
  groupby(dat1, [:CTR]),
  :rt => mean,
  :rt => std,
  :rt => length,
  :rt => (x -> std(x) / sqrt(length(x))) => :rt_semean,
)
4×5 DataFrame
Row CTR rt_mean rt_std rt_length rt_semean
String Float32 Float32 Int64 Float64
1 val 358.032 83.4579 20141 0.588067
2 sod 391.267 92.662 2863 1.73177
3 dos 405.146 92.6892 2843 1.73836
4 dod 402.3 95.3913 2863 1.78278

4 Julia contrast options

We use the same formula for all analyses

form = @formula rt ~ 1 + CTR + (1 + CTR | Subj)

This is the default order of factor levels.

show(StatsModels.levels(dat1.CTR))
["val", "sod", "dos", "dod"]

Controlling the ordering of levels for contrasts:

  1. kwarg levels to order the levels
  2. The first level is set as the baseline; with kwarg base a different level can be specified.

4.1 SeqDiffCoding

The SeqDiffCoding contrast corresponds to MASS::contr.sdif() in R. The assignment of random factors such as Subj to Grouping() is necessary when the sample size is very large. We recommend to include it always, but in this tutorial we do so only in the first example.

m1 = let levels = ["val", "sod", "dos", "dod"]
  contrasts = Dict(
    :CTR => SeqDiffCoding(; levels),
    :Subj => Grouping()
  )
  fit(MixedModel, form, dat1; contrasts)
end
Minimizing 197    Time: 0:00:00 ( 1.70 ms/it)
  objective:  325809.5493080511
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0909 54.96 <1e-99 55.1962
CTR: sod 33.7817 3.2873 10.28 <1e-24 23.2478
CTR: dos 13.9852 2.3055 6.07 <1e-08 10.7515
CTR: dod -2.7470 2.2138 -1.24 0.2147 9.5042
Residual 69.8349

4.2 HypothesisCoding

HypothesisCoding is the most general option available. We can implement all “canned” contrasts ourselves. The next example reproduces the test statistcs from SeqDiffCoding - with a minor modification illustrating the flexibility of going beyond the default version.

m1b = let levels = ["val", "sod", "dos", "dod"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1  1 0  0
         0 -1 1  0
         0  0 1 -1
      ];
      levels,
      labels=["spt", "obj", "grv"],
    ),
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0913 54.96 <1e-99 55.1997
CTR: spt 33.7817 3.2874 10.28 <1e-24 23.2481
CTR: obj 13.9852 2.3057 6.07 <1e-08 10.7533
CTR: grv 2.7469 2.2143 1.24 0.2148 9.5105
Residual 69.8348

The difference to the preprogrammed SeqDiffCoding is that for the third contrast we changed the direction of the contrast such that the sign of the effect is positive when the result is in agreement with theoretical expectation, that is we subtract the fourth level from the third, not the third level from the fourth.

4.3 DummyCoding

This contrast corresponds to contr.treatment() in R

m2 = let
  contrasts = Dict(:CTR => DummyCoding(; base="val"))
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 358.0914 6.1545 58.18 <1e-99 47.9139
CTR: dod 45.0200 4.3636 10.32 <1e-24 32.2912
CTR: dos 47.7669 3.5566 13.43 <1e-40 25.5367
CTR: sod 33.7817 3.2874 10.28 <1e-24 23.2481
Residual 69.8348

The DummyCoding contrast has the disadvantage that the intercept returns the mean of the level specified as base, default is the first level, not the GM.

4.4 YchycaeitCoding

The contrasts returned by DummyCoding may be exactly what we want. Can’t we have them, but also have the intercept estimate the GM, rather than the mean of the base level? Yes, we can! We call this “You can have your cake and it eat, too”-Coding (YchycaeitCoding). And we use HypothesisCoding to achieve this outcome.

m2b = let levels = ["val", "sod", "dos", "dod"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1 1 0 0
        -1 0 1 0
        -1 0 0 1
      ];
      levels,
      labels=levels[2:end],
    )
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0918 54.96 <1e-99 55.2039
CTR: sod 33.7817 3.2877 10.28 <1e-24 23.2512
CTR: dos 47.7669 3.5567 13.43 <1e-40 25.5374
CTR: dod 45.0200 4.3637 10.32 <1e-24 32.2919
Residual 69.8349

We can simply relevel the factor or move the column with -1s for a different base.

m2c = let levels = ["val", "sod", "dos", "dod"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1/2 1/2   0   0
        -1/2   0 1/2   0
        -1/2   0   0  1/2
      ];
      levels,
      labels=levels[2:end],
    )
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0909 54.96 <1e-99 55.1963
CTR: sod 16.8909 1.6436 10.28 <1e-24 11.6237
CTR: dos 23.8835 1.7782 13.43 <1e-40 12.7676
CTR: dod 22.5100 2.1817 10.32 <1e-24 16.1452
Residual 69.8348

We can simply relevel the factor or move the column with -1s for a different base.

4.5 EffectsCoding

This contrast corresponds almost to contr.sum() in R.

m3 = let
  contrasts = Dict(:CTR => EffectsCoding(; base="dod"))
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.1012 54.88 <1e-99 55.2772
CTR: dos 16.1247 1.4410 11.19 <1e-28 7.3387
CTR: sod 2.1396 1.3352 1.60 0.1091 6.0268
CTR: val -31.6422 2.6439 -11.97 <1e-32 19.9638
Residual 69.8343

The “almost” qualification refers to the fact that contr.sum() uses the last factor levels as default base level; EffectsCoding uses the first level.

m3b = let levels = [ "dod", "val", "sod", "dos"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
         -1/4   3/4 -1/4  -1/4
         -1/4  -1/4  3/4  -1/4
         -1/4  -1/4 -1/4   3/4
      ];
      levels,
      labels=levels[2:end],
    )
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0905 54.97 <1e-99 55.1933
CTR: val -31.6422 2.6424 -11.97 <1e-32 19.9515
CTR: sod 2.1396 1.3339 1.60 0.1087 6.0089
CTR: dos 16.1248 1.4402 11.20 <1e-28 7.3291
Residual 69.8349
m3c = let levels = [ "dod", "val", "sod", "dos"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
         -1/2   3/2 -1/2  -1/2
         -1/2  -1/2  3/2  -1/2
         -1/2  -1/2 -1/2   3/2
      ];
      levels,
      labels=levels[2:end],
    )
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0900 54.97 <1e-99 55.1893
CTR: val -63.2843 5.2844 -11.98 <1e-32 39.8997
CTR: sod 4.2792 2.6679 1.60 0.1087 12.0195
CTR: dos 32.2495 2.8809 11.19 <1e-28 14.6648
Residual 69.8348

4.6 HelmertCoding

HelmertCoding codes each level as the difference from the average of the lower levels. With the default order of CTR levels we get the following test statistics. These contrasts are othogonal.

m4 = let
  contrasts = Dict(:CTR => HelmertCoding())
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0913 54.96 <1e-99 55.1998
CTR: dos 1.3735 1.1070 1.24 0.2147 4.7539
CTR: sod -4.2039 0.6843 -6.14 <1e-09 3.3491
CTR: val -10.5474 0.8808 -11.98 <1e-32 6.6502
Residual 69.8348
+ HeC1: (2 - 1)/2           # (391 - 358)/2
+ HeC2: (3 - (2+1)/2)/3     # (405 - (391 + 358)/2)/3
+ HeC3: (4 - (3+2+1)/3)/4   # (402 - (405 + 391 + 358)/3)/4

4.7 Reverse HelmertCoding

Reverse HelmertCoding codes each level as the difference from the average of the higher levels. To estimate these effects we simply reverse the order of factor levels. Of course, the contrasts are also orthogonal.

m4b = let levels = reverse(StatsModels.levels(dat1.CTR))
  contrasts = Dict(:CTR => HelmertCoding(; levels))
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0913 54.96 <1e-99 55.1998
CTR: dos 1.3735 1.1070 1.24 0.2147 4.7539
CTR: sod -4.2039 0.6843 -6.14 <1e-09 3.3491
CTR: val -10.5474 0.8808 -11.98 <1e-32 6.6502
Residual 69.8348
+ HeC1:(3 - 4)/2            # (405 - 402)/2
+ HeC2:(2 - (3+4)/2)/3      # (391 - (405 + 402)/2)/3
+ HeC3:(1 - (2+3+4)/3/4     # (356  -(391 + 405 + 402)/3)/4

4.8 Anova Coding

Factorial designs (i.e., lab experiments) are traditionally analyzed with analysis of variance. The test statistics of main effects and interactions are based on an orthogonal set of contrasts. We specify them with HypothesisCoding.

4.8.1 A(2) x B(2)

An A(2) x B(2) design can be recast as an F(4) design with the levels (A1-B1, A1-B2, A2-B1, A2-B2). The following contrast specification returns estimates for the main effect of A, the main effect of B, and the interaction of A and B. In a figure With A on the x-axis and the levels of B shown as two lines, the interaction tests the null hypothesis that the two lines are parallel. A positive coefficient implies overadditivity (diverging lines toward the right) and a negative coefficient underadditivity (converging lines).

m5 = let levels = ["val", "sod", "dos", "dod"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1 -1 +1 +1          # A
        -1 +1 -1 +1          # B
        +1 -1 -1 +1          # A x B
      ];
      levels,
      labels=["A", "B", "AxB"],
    ),
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0917 54.96 <1e-99 55.2028
CTR: A 59.0052 5.1827 11.39 <1e-29 36.2088
CTR: B 31.0348 4.6753 6.64 <1e-10 31.7164
CTR: AxB -36.5287 3.0931 -11.81 <1e-31 16.0093
Residual 69.8348

It is also helpful to see the corresponding layout of the four means for the interaction of A and B (i.e., the third contrast)

        B1     B2
   A1   +1     -1
   A2   -1     +1

Thus, interaction tests whether the difference between main diagonal and minor diagonal is different from zero.

4.8.2 A(2) x B(2) x C(2)

Going beyond the four level factor; it is also helpful to see the corresponding layout of the eight means for the interaction of A and B and C.

          C1              C2
      B1     B2        B1     B2
 A1   +1     -1   A1   -1     +1
 A2   -1     +1   A2   +1     -1

4.8.3 A(2) x B(2) x C(3)

TO BE DONE

4.9 Nested coding

Nested contrasts are often specified as follow up as post-hoc tests for ANOVA interactions. They are orthogonal. We specify them with HypothesisCoding.

An A(2) x B(2) design can be recast as an F(4) design with the levels (A1-B1, A1-B2, A2-B1, A2-B2). The following contrast specification returns an estimate for the main effect of A and the effects of B nested in the two levels of A. In a figure With A on the x-axis and the levels of B shown as two lines, the second contrast tests whether A1-B1 is different from A1-B2 and the third contrast tests whether A2-B1 is different from A2-B2.

m8 = let levels = ["val", "sod", "dos", "dod"]
  contrasts = Dict(
    :CTR => HypothesisCoding(
      [
        -1 -1 +1 +1
        -1 +1  0  0
         0  0 +1 -1
      ];
      levels,
      labels=["do_so", "spt", "grv"],
    ),
  )
  fit(MixedModel, form, dat1; contrasts)
end
Est. SE z p σ_Subj
(Intercept) 389.7336 7.0914 54.96 <1e-99 55.2003
CTR: do_so 59.0052 5.1825 11.39 <1e-29 36.2074
CTR: spt 33.7817 3.2873 10.28 <1e-24 23.2474
CTR: grv 2.7470 2.2142 1.24 0.2148 9.5101
Residual 69.8348

The three contrasts for one main effect and two nested contrasts are orthogonal. There is no test of the interaction (parallelism).

5 Other orthogonal contrasts

For factors with more than four levels there are many options for specifying orthogonal contrasts as long as one proceeds in a top-down strictly hiearchical fashion.

Suppose you have a factor with seven levels and let’s ignore shifting columns. In this case, you have six options for the first contrast, that is 6 vs. 1, 5 vs.2 , 4 vs. 3, 3 vs. 4, 2 vs. 5, and 1 vs. 6 levels. Then, you specify orthogonal contrasts for partitions with more than 2 elements and so on. That is, you don’t specify a contrast that crosses an earlier partition line.

In the following example, after an initial 4 vs 3 partitioning of levels, we specify AnovaCoding for the left and HelmertCoding for the right partition.

contrasts = Dict(
  :CTR => HypothesisCoding(
    [
      -1/4 -1/4 -1/4 -1/4 +1/3 +1/3 +1/3
      -1/2 -1/2 +1/2 +1/2    0    0    0
      -1/2 +1/2 -1/2 +1/2    0    0    0
      +1/2 -1/2 -1/2 +1/2    0    0    0
         0    0    0    0   -1   +1    0
         0    0    0    0 -1/2 -1/2    1
    ];
    levels=["A1", "A2", "A3", "A4", "A5", "A6", "A7"],
    labels=["c567.1234", "B", "C", "BxC", "c6.5", "c6.56"],
  ),
);

There are two rules that hold for all orthogonal contrasts:

  1. The weights within rows sum to zero.
  2. For all pairs of rows, the sum of the products of weights in the same columns sums to zero.

6 Appendix: Summary (Dave Kleinschmidt)

StatsModels

StatsModels.jl provides a few commonly used contrast coding schemes, some less-commonly used schemes, and structs that allow you to manually specify your own, custom schemes.

6.1 Standard contrasts

The most commonly used contrasts are DummyCoding and EffectsCoding (which are similar to contr.treatment() and contr.sum() in R, respectively).

6.2 “Exotic” contrasts (rk: well …)

We also provide HelmertCoding and SeqDiffCoding (corresponding to base R’s contr.helmert() and MASS::contr.sdif()).

6.3 Manual contrasts

ContrastsCoding()

There are two ways to manually specify contrasts. First, you can specify them directly via ContrastsCoding. If you do, it’s good practice to specify the levels corresponding to the rows of the matrix, although they can be omitted in which case they’ll be inferred from the data.

HypothesisCoding()

A better way to specify manual contrasts is via HypothesisCoding, where each row of the matrix corresponds to the weights given to the cell means of the levels corresponding to each column (see Schad et al. (2020) for more information).

Back to top

References

Brehm, L., & Alday, P. M. (2022). Contrast coding choices in a decade of mixed models. Journal of Memory and Language, 125, 104334. https://doi.org/10.1016/j.jml.2022.104334
Kliegl, R., Kushela, J., & Laubrock, J. (2015). Object orientation and target size modulate the speed of visual attention. Department of Psychology, University of Potsdam.
Kliegl, R., Wei, P., Dambacher, M., Yan, M., & Zhou, X. (2011). Experimental effects and individual differences in linear mixed models: Estimating the relationship between spatial, object, and attraction effects in visual attention. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2010.00238
Schad, D. J., Vasishth, S., Hohenstein, S., & Kliegl, R. (2020). How to capitalize on a priori contrasts in linear (mixed) models: A tutorial. Journal of Memory and Language, 110, 104038. https://doi.org/10.1016/j.jml.2019.104038