Introduction

In the within-subject differences and factorial differences worksheets, we looked at Bayesian ANOVA methods. Throughout Research Methods in R we focus on Bayesian, rather than traditional (“p value”) methods, because Bayes Factors are more useful and easier to understand. However, psychologists have been using traditional ANOVA methods since the 1970s, and it was not until the 2010s that we started using Bayesian methods. So, there’s a lot of older work out there that uses a different technique. For this reason, it is probably still worth knowing how to do a traditional ANOVA. This is what we’ll cover in this worksheet.

The following assumes you have completed the within-subject differences and factorial differences worksheets. Make sure you are in your R project that contains the data downloaded from the git repository - see the preprocessing worksheet for details. Create a new R script called afex.R and enter all comments and commands from this worksheet into it.

Getting started

To do traditional ANOVA in R, you will need to load the afex package.

Enter this comment and command into your script, and run them:

# Load 'afex' package, for NHST ANOVA
library(afex)

You’ll also need to load the data, preprocess it, and set factors, as we did for Bayesian ANOVA. Here’s how (enter these comments and commands into your script, and run them):

# Load tidyverse
library(tidyverse)
# Load data
words <- read_csv("wordnaming2.csv")
# Produce subject-level summary
wordsum <-
    words %>%
    group_by(subj, medit) %>%
    summarise(rt = mean(rt))
# Set factors
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)

One between-subject factor

To do a traditional, one between-subject factor ANOVA, use the following command.

Enter this comment and command into your script, and run it:

# One b/subj factor ANOVA
aov_car(formula = rt ~ medit + Error(subj), data = wordsum)
Contrasts set to contr.sum for the following variables: medit
Anova Table (Type 3 tests)

Response: rt
  Effect     df     MSE         F  ges p.value
1  medit 2, 417 3812.97 11.11 *** .051   <.001
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Explanation of command

The command aov_car is similar to the command anovaBF, but for traditional ANOVA. In most ways, this command works like anovaBF. For example, the part formula = rt ~ medit says we want to look at the effect of the medit (meditation) variable on the rt (reaction time) variable. The next bit, + Error(subj) is also very similar to anovaBF - it tells aov_car which column contains the participant IDs (in this case, subj). The final part, data = wordsum tells the command where to find the data (just as with anovaBF).
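
If you like, you can also store the result of aov_car in an object, so you can print it again later without re-running the analysis. For example (the object name trad1 is just an arbitrary choice):

# Store the ANOVA in an object, so it can be printed again later
trad1 <- aov_car(formula = rt ~ medit + Error(subj), data = wordsum)
# Print the ANOVA table
trad1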

Explanation of output

Although there’s quite a lot in this output, the main thing to focus on is the number underneath p.value. In this case, the number is ‘<.001’. So, the p value in this case is less than .001, written as p < .001. Recall that p values are widely misinterpreted by psychologists, but we have a convention that if the p value is less than .05, then people will believe there is a difference. In this case, the p value is less than .05, and so people will believe there is a difference.

If you are reporting the results of a traditional ANOVA in a journal article, you are generally expected to report the F value, along with the degrees of freedom (df), as well as the p value. So, in this case, you would write:

The three groups differed significantly, F(2, 417) = 11.11, p < .001.

The F ratio and degrees of freedom are not particularly meaningful or useful information for the reader to have, but nonetheless journals normally require them when reporting traditional ANOVA. For further explanation, see the more on ANOVA worksheet.
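
If you are curious how the p value relates to the F ratio and its degrees of freedom, you can reproduce it yourself with base R's pf command. This is an optional aside, not something you need for the rest of the worksheet:

# Optional: probability of an F ratio at least this large, given the
# degrees of freedom (this is what the p value reports)
pf(11.11, df1 = 2, df2 = 417, lower.tail = FALSE)

The result is well below .001, which is why the ANOVA table just prints <.001.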

One useful piece of information aov_car provides is ges - which is .05 in this example. ges stands for “generalized eta-squared”. This is a measure of effect size, somewhat like Cohen’s d, but the scale is different. ges is much like a correlation coefficient, and ranges between 0 and 1. A large effect size for ges is around .26, a medium-sized effect is around .13, and a small effect is around .02 (further details here). It can be useful to report generalized eta-squared, and this is reported as \(\eta_{g}^{2}\). So one could write:

The three groups differed significantly, F(2, 417) = 11.11, p < .001, \(\eta_{g}^{2}\) = .05.

Note that some authors will instead report a related measure called partial eta-squared (\(\eta_{p}^{2}\)). This is not the same thing and, in most circumstances, it is better to report generalized eta-squared. This point is discussed further here.
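
If you ever need partial eta-squared, for example to compare your result with an older paper that reports it, afex can be asked for it instead of generalized eta-squared. Here's a sketch, assuming the anova_table option of aov_car (see ?aov_car if your version behaves differently):

# Ask for partial eta-squared ("pes") rather than the default
# generalized eta-squared ("ges")
aov_car(formula = rt ~ medit + Error(subj), data = wordsum,
        anova_table = list(es = "pes"))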

One within-subject factor

First, preprocess the data for this example.

Enter these comments and commands into your script, and run them:

# Load data
words <- read_csv("wordnaming2.csv")
# Select control condition, remove neutral condition
wordctrl <- words %>% filter(medit == "control")
wordctrlCI <- wordctrl %>% filter(congru != "neutral")
# Create subject-level summary
wordctrlCIsum <- wordctrlCI %>%
    group_by(subj, congru) %>%
    summarise(rt = mean(rt))
# Make factors
wordctrlCIsum$congru <- factor(wordctrlCIsum$congru)
wordctrlCIsum$subj <- factor(wordctrlCIsum$subj)

Now, this is the command to run a traditional within-subjects ANOVA (Enter this comment and command into your script, and run it):

# One-factor, w/subj ANOVA
aov_car(formula = rt ~ Error(subj/congru), data = wordctrlCIsum)
Anova Table (Type 3 tests)

Response: rt
  Effect     df     MSE         F  ges p.value
1 congru 1, 139 4949.44 44.78 *** .109   <.001
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Explanation of command

As before, aov_car works similarly to anovaBF - for example, we specify a formula and the data set (data). It differs mainly in how the right-hand side of the formula (everything after the ~) is laid out. To do a within-subjects test in aov_car we write Error(x/y), where x is the column of the data frame that contains the subject IDs (subj in this case), and y is the column of the data frame that contains the within-subjects condition (congru in this case).
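
As an optional cross-check, a within-subjects ANOVA with only two levels is equivalent to a paired t-test, and the F value should equal the square of the t value. Here's a sketch that uses pivot_wider to put the two conditions into separate columns (it assumes subj ends up as the first column of the widened data frame, with the two conditions in columns 2 and 3):

# Optional cross-check: with two levels, this ANOVA is equivalent to a
# paired t-test; F should equal t squared
wide <- wordctrlCIsum %>%
    pivot_wider(names_from = congru, values_from = rt)
# Columns 2 and 3 hold the two congruency conditions
t.test(wide[[2]], wide[[3]], paired = TRUE)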

Explanation of output

The output can be read exactly the same way as in the last example. When reporting this analysis, one would write:

The two conditions differed significantly, F(1, 139) = 44.78, p < .001, \(\eta_{g}^{2}\) = .11.

More than two levels

First, preprocess the data for this example.

Enter these commands into your script, and run them:

# Create subject-level summary
wordctrlsum <- wordctrl %>%
    group_by(subj, congru) %>%
    summarise(rt = mean(rt))
# Create factors
wordctrlsum$congru <- factor(wordctrlsum$congru)
wordctrlsum$subj <- factor(wordctrlsum$subj)

The command to run a traditional ANOVA with more than two levels is exactly the same as it is with two levels. For example, the within-subjects version is (Enter this command into your script, and run it):

# One-factor, w/subj, more than two levels
aov_car(formula = rt ~ Error(subj/congru), data = wordctrlsum)
Anova Table (Type 3 tests)

Response: rt
  Effect           df     MSE         F  ges p.value
1 congru 1.05, 145.44 4842.14 43.75 *** .085   <.001
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sphericity correction method: GG 

Explanation of output

The output is read exactly the same way as before. In this example, you would write:

Congruency significantly affected reaction time, F(1.05, 145.44) = 43.75, p < .001, \(\eta_{g}^{2}\) = .09.

There is one new aspect here, the statement at the end:

Sphericity correction method: GG

You’ll see something like this in traditional ANOVA outputs when a within-subjects factor has more than two levels. It turns out that traditional ANOVA methods generally get the p value wrong in these cases, because they make assumptions about the data which are normally untrue. The incorrect assumption they make is that the data is “spherical” – if you’re curious what that means, click here.

Fortunately, there are ways of correcting this error. These are called sphericity corrections (“sphericity” is the property of being spherical). The two main methods are Greenhouse-Geisser (GG) correction, and Huynh-Feldt (HF) correction. aov_car picks the most appropriate one, makes the correction, and tells you it’s done so by including that statement Sphericity correction method: GG at the end. Another way you can tell a sphericity correction has been applied is that the degrees of freedom (df) are normally not whole numbers (1.05 and 145.44 in this case).

You would normally report towards the beginning of your results that you used such a correction. For example:

“Greenhouse-Geisser corrections for non-sphericity were applied where appropriate.”
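
If you want to see the sphericity test itself, along with both the GG and HF corrections, you can store the ANOVA and ask for a fuller summary. This is optional, and assumes afex's summary method, which gives considerably more detailed output:

# Optional: store the ANOVA, then look at the fuller summary, which
# includes Mauchly's test of sphericity and the GG and HF corrections
trad2 <- aov_car(formula = rt ~ Error(subj/congru), data = wordctrlsum)
summary(trad2)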

Pairwise comparisons

You can do pairwise comparisons in traditional ANOVA in the same way you do them with anovaBF. In other words, filter the data to include just the two conditions you want to compare, and run the appropriate test. There are also other ways to do this, but we don’t cover them in these intermediate-level worksheets.
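
For example, to compare just the two non-neutral congruency conditions from the previous analysis, you could filter out the neutral condition and re-run aov_car. A sketch:

# Pairwise comparison: drop the 'neutral' condition, then re-run the ANOVA
pair <- wordctrlsum %>% filter(congru != "neutral")
pair$congru <- factor(pair$congru)  # re-apply factor to drop the unused level
aov_car(formula = rt ~ Error(subj/congru), data = pair)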

Factorial ANOVA

First, preprocess the data for this example.

Enter these commands into your script, and run them:

# Create subject-level summary
wordsum <-
    words %>%
    group_by(subj, medit, congru) %>%
    summarise(rt = mean(rt))
# Create factors
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)
wordsum$congru <- factor(wordsum$congru)

A traditional ANOVA with one within-subjects factor and one between-subjects factor is conducted like this (Enter this command into your script, and run it):

# One w/subj, one b/subj, ANOVA
aov_car(formula = rt ~ medit + Error(subj/congru), data = wordsum)
Contrasts set to contr.sum for the following variables: medit
Anova Table (Type 3 tests)

Response: rt
        Effect           df      MSE         F  ges p.value
1        medit       2, 417 11759.86 11.15 *** .035   <.001
2       congru 1.05, 436.24  5093.68 48.44 *** .035   <.001
3 medit:congru 2.09, 436.24  5093.68 13.39 *** .020   <.001
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sphericity correction method: GG 

Explanation of command

The aov_car command for this factorial ANOVA just combines the formula we used for a single between-subjects factor, rt ~ medit + Error(subj), with the formula we used for a single within-subjects factor, rt ~ Error(subj/congru), to give the full formula rt ~ medit + Error(subj/congru). Note that the + sign is used to combine a within-subjects component with a between-subjects component, rather than the * sign we used in anovaBF. This is just because the two commands were written by different people; it doesn’t have any deeper significance.

Explanation of output

The first two lines of the output are the main effect for medit and the main effect for congru, respectively. The third line, medit:congru is the interaction between these two factors. Everything else has the same meaning as in previous outputs. You’ll notice that the main effect F values are not quite the same as they were in the earlier examples on this worksheet. As we saw in our Bayesian ANOVA, different analyses can give different answers.
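
When an interaction such as medit:congru is significant, a table of condition means can help you see what is driving it. You can make one with the group_by and summarise commands you have already used, for example:

# Mean RT for each combination of meditation group and congruency
wordsum %>%
    group_by(medit, congru) %>%
    summarise(mean_rt = mean(rt))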

Two between-subject factors

First, preprocess the data for this example.

Enter these commands into your script, and run them:

# Create subject-level summary
wordsum <- words %>% group_by(subj, sex, medit) %>% summarise(rt = mean(rt))
# Create factors
wordsum$sex <- factor(wordsum$sex)
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)

The command for a two between-subject factor traditional ANOVA is given below. This is much like previous commands; the only thing to note is that * is used to combine between-subjects factors in aov_car. The output can be interpreted in the same way as before.

Enter this command into your script, and run it:

# Two b/subj factors, ANOVA
aov_car(formula = rt ~ medit*sex + Error(subj), data = wordsum)
Contrasts set to contr.sum for the following variables: medit, sex
Anova Table (Type 3 tests)

Response: rt
     Effect     df     MSE         F  ges p.value
1     medit 2, 414 3781.43 11.20 *** .051   <.001
2       sex 1, 414 3781.43    5.64 * .013    .018
3 medit:sex 2, 414 3781.43      0.42 .002    .657
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Two within-subject factors

First, preprocess the data for this example.

Enter these commands into your script, and run them:

# Create subject-level summary
wordsum <- words %>% group_by(subj, congru, block) %>% summarise(rt = mean(rt))
# Create factors
wordsum$congru <- factor(wordsum$congru)
wordsum$subj <- factor(wordsum$subj)
wordsum$block <- factor(wordsum$block)

The command for a two within-subject factor traditional ANOVA is given below. This is much like previous commands; the only thing to note is that * is used to combine within-subjects factors in aov_car. The output can be interpreted in the same way as before.

Enter this command into your script, and run it:

# Two w/subj factors, ANOVA
aov_car(formula = rt ~ Error(subj/congru*block), data = wordsum)
Anova Table (Type 3 tests)

Response: rt
        Effect            df      MSE         F   ges p.value
1       congru  1.04, 437.22 16222.93 45.80 ***  .031   <.001
2        block  1.68, 702.77  1608.56    2.43 + <.001    .099
3 congru:block 3.76, 1576.44   421.40      1.34 <.001    .255
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sphericity correction method: GG 

More than two factors

Unlike Bayesian ANOVA, traditional ANOVA techniques can handle designs with more than two factors quite efficiently.

First, preprocess the data for this example; Enter these commands into your script, and run them:

# Create subject-level summary
wordsum <- words %>% group_by(subj, sex, medit, congru, block) %>% summarise(rt = mean(rt))
# Create factors
wordsum$sex <- factor(wordsum$sex)
wordsum$subj <- factor(wordsum$subj)
wordsum$medit <- factor(wordsum$medit)
wordsum$congru <- factor(wordsum$congru)
wordsum$block <- factor(wordsum$block)

Now, we can run an ANOVA with all four factors; Enter this command into your script, and run it:

# Four-factor ANOVA
aov_car(formula = rt ~ sex*medit + Error(subj/congru*block), data = wordsum)
Contrasts set to contr.sum for the following variables: sex, medit
Anova Table (Type 3 tests)

Response: rt
                   Effect            df      MSE         F   ges p.value
1                     sex        1, 414 34897.39    5.79 *  .009    .017
2                   medit        2, 414 34897.39 11.43 ***  .034   <.001
3               sex:medit        2, 414 34897.39      0.39  .001    .680
4                  congru  1.05, 433.57 15058.29 49.17 ***  .033   <.001
5              sex:congru  1.05, 433.57 15058.29   6.89 **  .005    .008
6            medit:congru  2.09, 433.57 15058.29 13.83 ***  .019   <.001
7        sex:medit:congru  2.09, 433.57 15058.29      0.61 <.001    .549
8                   block  1.67, 692.52  1625.05      2.41 <.001    .100
9               sex:block  1.67, 692.52  1625.05      0.23 <.001    .757
10            medit:block  3.35, 692.52  1625.05      0.52 <.001    .686
11        sex:medit:block  3.35, 692.52  1625.05      0.29 <.001    .852
12           congru:block 3.76, 1557.63   419.60      1.35 <.001    .253
13       sex:congru:block 3.76, 1557.63   419.60      1.45 <.001    .217
14     medit:congru:block 7.52, 1557.63   419.60    1.80 + <.001    .078
15 sex:medit:congru:block 7.52, 1557.63   419.60      0.87 <.001    .532
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

Sphericity correction method: GG 

We get some very extensive output. This is because, when we have more than two factors in an analysis, it’s possible that there will be higher-order interactions in our data (e.g. sex:medit:congru), and there are a large number of these higher-order interactions to consider. Generally speaking, people find higher-order interactions very hard to understand or relate to their hypotheses. Trying to make sense of them is beyond this intermediate-level worksheet.


This material is distributed under a Creative Commons licence. CC-BY-SA 4.0.