Convenient ANOVA estimation for factorial designs
Description
These functions allow convenient specification of any type of ANOVAs (i.e., purely withinsubjects ANOVAs, purely betweensubjects ANOVAs, and mixed betweenwithin or splitplot ANOVAs) for data in the long format (i.e., one observation per row). If the data has more than one observation per individual and cell of the design (e.g., multiple responses per condition), the data will by automatically aggregated. The default settings reproduce results from commercial statistical packages such as SPSS or SAS. aov_ez
is called specifying the factors as character vectors, aov_car
is called using a formula similar to aov
specifying an error strata for the withinsubject factor(s), and aov_4
is called with a lme4like formula (all ANOVA functions return identical results). The returned object contains the ANOVA also fitted via base R's aov
which can be passed to e.g., lsmeans for further analysis (e.g., followup tests, contrasts, plotting, etc.). These functions employ Anova
(from the car package) to provide test of effects avoiding the somewhat unhandy format of car::Anova
.
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18  aov_ez(id, dv, data, between = NULL, within = NULL, covariate = NULL,
observed = NULL, fun.aggregate = NULL, type = afex_options("type"),
factorize = afex_options("factorize"),
check.contrasts = afex_options("check.contrasts"),
return = afex_options("return_aov"),
anova_table = list(), ..., print.formula = FALSE)
aov_car(formula, data, fun.aggregate = NULL, type = afex_options("type"),
factorize = afex_options("factorize"),
check.contrasts = afex_options("check.contrasts"),
return = afex_options("return_aov"), observed = NULL,
anova_table = list(), ...)
aov_4(formula, data, observed = NULL, fun.aggregate = NULL, type = afex_options("type"),
factorize = afex_options("factorize"),
check.contrasts = afex_options("check.contrasts"),
return = afex_options("return_aov"),
anova_table = list(), ..., print.formula = FALSE)

Arguments
formula 
A formula specifying the ANOVA model similar to 
data 
A 
fun.aggregate 
The function for aggregating the data before running the ANOVA if there is more than one observation per individual and cell of the design. The default 
type 
The type of sums of squares for the ANOVA. The default is given by 
factorize 
logical. Should between subject factors be factorized (with note) before running the analysis. he default is given by 
check.contrasts 

return 
What should be returned? The default is given by 
observed 

anova_table 

... 
Further arguments passed to 
id 

dv 

between 

within 

covariate 

print.formula 

Details
Details of ANOVA Specification
aov_ez
will concatenate all betweensubject factors using *
(i.e., producing all main effects and interactions) and all covariates by +
(i.e., adding only the main effects to the existing betweensubject factors). The withinsubject factors do fully interact with all betweensubject factors and covariates. This is essentially identical to the behavior of SPSS's glm
function.
The formula
s for aov_car
or aov_4
must contain a single Error
term specifying the ID
column and potential withinsubject factors (you can use mixed
for running mixedeffects models with multiple error terms). Factors outside the Error
term are treated as betweensubject factors (the withinsubject factors specified in the Error
term are ignored outside the Error
term; in other words, it is not necessary to specify them outside the Error
term, see Examples).
Suppressing the intercept (i.e, via 0 +
or  1
) is ignored. Specific specifications of effects (e.g., excluding terms with 
or using ^
) could be okay but is not tested. Using the I
or poly
function within the formula is not tested and not supported!
To run an ANCOVA you need to set factorize = FALSE
and make sure that all variables have the correct type (i.e., factors are factors and numeric variables are numeric and centered).
Note that the default behavior is to include calculation of the effect size generalized etasquared for which all nonmanipluated (i.e., observed) variables need to be specified via the observed
argument to obtain correct results. When changing the effect size to "pes"
(partial etasquared) or "none"
via anova_table
this becomes unnecessary.
If check.contrasts = TRUE
, contrasts will be set to "contr.sum"
for all betweensubject factors if default contrasts are not equal to "contr.sum"
or attrib(factor, "contrasts") != "contr.sum"
. (withinsubject factors are hardcoded "contr.sum"
.)
Statistical Issues
Type 3 sums of squares are default in afex. While some authors argue that socalled type 3 sums of squares are dangerous and/or problematic (most notably Venables, 2000), they are the default in many commercial statistical application such as SPSS or SAS. Furthermore, statisticians with an applied perspective recommend type 3 tests (e.g., Maxwell and Delaney, 2004). Consequently, they are the default for the ANOVA functions described here. For some more discussion on this issue see here.
Note that lower order effects (e.g., main effects) in type 3 ANOVAs are only meaningful with effects coding. That is, contrasts should be set to contr.sum
to obtain meaningful results. This is imposed automatically for the functions discussed here as long as check.contrasts
is TRUE
(the default). I nevertheless recommend to set the contrasts globally to contr.sum
via running set_sum_contrasts
. For a discussion of the other (nonrecommended) coding schemes see here.
FollowUp Contrasts and PostHoc Tests
The S3 object returned per default can be directly passed to lsmeans::lsmeans
for further analysis. This allows to test any type of contrasts that might be of interest independent of whether or not this contrast involves betweensubject variables, withinsubject variables, or a combination thereof. The general procedure to run those contrasts is the following (see Examples for a full example):
Estimate an
afex_aov
object with the function returned here. For example:x < aov_car(dv ~ a*b + (id/c), d)
Obtain a
ref.grid
object by runninglsmeans
on theafex_aov
object from step 1 using the factors involved in the contrast. For example:r < lsmeans(x, ~a:c)
Create a list containing the desired contrasts on the reference grid object from step 2. For example:
con1 < list(a_x = c(1, 1, 0, 0, 0, 0), b_x = c(0, 0, 0.5, 0.5, 0, 1))
Test the contrast on the reference grid using
contrast
. For example:contrast(r, con1)
To control for multiple testing pvalue adjustments can be specified. For example the BonferroniHolm correction:
contrast(r, con1, adjust = "holm")
Note that lsmeans allows for a variety of advanced settings and simplifiations, for example: all pairwise comparison of a single factor using one command (e.g., lsmeans(x, "a", contr = "pairwise")
) or advanced control for multiple testing by passing objects to multcomp. A comprehensive overview of the functionality is provided in the accompanying vignettes (see here).
A caveat regarding the use of lsmeans concerns the assumption of sphericity for ANOVAs including withinsubjects/repeatedmeasures factors (with more than two levels). While the ANOVA tables per default report results using the GreenhousseGeisser correction, no such correction is available when using lsmeans. This may result in anticonservative tests.
lsmeans is loaded/attached automatically when loading afex via library
or require
.
Methods for afex_aov
Objects
A full overview over the methods provided for afex_aov
objects is provided in the corresponding help page: afex_aovmethods
. The probably most important ones for endusers are summary
and anova
.
The summary
method returns, for ANOVAs containing withinsubject (repeatedmeasures) factors with more than two levels, the complete univariate analysis: Results without dfcorrection, the GreenhouseGeisser corrected results, the HyunhFeldt corrected results, and the results of the Mauchly test for sphericity.
The anova
method returns a data.frame
of class "anova"
containing the ANOVA table in numeric form (i.e., the one in slot anova_table
of a afex_aov
). This method has arguments such as correction
and es
and can be used to obtain an ANOVA table with different correction than the one initially specified.
Value
aov_car
, aov_4
, and aov_ez
are wrappers for Anova
and aov
, the return value is dependent on the return
argument. Per default, an S3 object of class "afex_aov"
is returned containing the following slots:
"anova_table"
An ANOVA table of class
c("anova", "data.frame")
."aov"
aov
object returned fromaov
(should not be used to evaluate significance of effects, but can be passed tolsmeans
for posthoc tests)."Anova"
object returned from
Anova
, an object of class"Anova.mlm"
(if withinsubjects factors are present) or of classc("anova", "data.frame")
."lm"
the object fitted with
lm
and passed toAnova
(i.e., an object of class"lm"
or"mlm"
). Also returned ifreturn = "lm"
."data"
a list containing: (1)
long
(the possibly aggregated data in long format used foraov
),wide
(the data used to fit thelm
object), andidata
(if withinsubject factors are present, theidata
argument passed tocar::Anova
). Also returned ifreturn = "data"
."information"
A list containing information such as name of dependent variable and id variable.
The print method for afex_aov
objects (invisibly) returns (and prints) the same as if return
is "nice"
: a nice ANOVA table (produced by nice
) with the following columns: Effect
, df
, MSE
(meansquared errors), F
(potentially with significant symbols), ges
(generalized etasquared), p
.
Note
Calculation of ANOVA models via aov
(which is done per default) can be comparatively slow and produce comparatively large objects for ANOVAs with many withinsubjects factors or levels. To avoid this calculation set the return argument to "nice"
. This can also be done globally via afex_options(return_aov = "nice")
. return = "nice"
also produces the default output of previous versions of afex (versions 0.13 and earlier).
The id variable and variables entered as withinsubjects (i.e., repeatedmeasures) factors are silently converted to factors. Levels of withinsubject factors are converted to valid variable names using make.names(...,unique=TRUE)
. Unused factor levels are silently dropped on all variables.
Contrasts attached to a factor as an attribute are probably not preserved and not supported.
The workhorse is aov_car
. aov_4
and aov_ez
only construe and pass an appropriate formula to aov_car
. Use print.formula = TRUE
to view this formula.
In contrast to aov
aov_car
assumes that all factors to the right of /
in the Error
term are belonging together. Consequently, Error(id/(a*b))
and Error(id/a*b)
are identical (which is not true for aov
).
Author(s)
Henrik Singmann
The design of these functions was influenced by ezANOVA
from package ez.
References
Maxwell, S. E., & Delaney, H. D. (2004). Designing Experiments and Analyzing Data: A ModelComparisons Perspective. Mahwah, N.J.: Lawrence Erlbaum Associates.
Venables, W.N. (2000). Exegeses on linear models. Paper presented to the SPlus User's Conference, Washington DC, 89 October 1998, Washington, DC. Available from: http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf
See Also
Various methods for objects of class afex_aov
are available: afex_aovmethods
nice
creates the nice ANOVA tables which is by default printed. See also there for a slightly longer discussion of the available effect sizes.
mixed
provides a (formula) interface for obtaining pvalues for mixedmodels via lme4.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171  ##########################
## 1: Specifying ANOVAs ##
##########################
# Example using a purely withinsubjects design
# (Maxwell & Delaney, 2004, Chapter 12, Table 12.5, p. 578):
data(md_12.1)
aov_ez("id", "rt", md_12.1, within = c("angle", "noise"),
anova_table=list(correction = "none", es = "none"))
# Default output
aov_ez("id", "rt", md_12.1, within = c("angle", "noise"))
# examples using obk.long (see ?obk.long), a long version of the OBrienKaiser dataset (car package).
# Data is a splitplot or mixed design: contains both within and betweensubjects factors.
data(obk.long, package = "afex")
# estimate mixed ANOVA on the full design:
aov_car(value ~ treatment * gender + Error(id/(phase*hour)),
data = obk.long, observed = "gender")
aov_4(value ~ treatment * gender + (phase*hourid),
data = obk.long, observed = "gender")
aov_ez("id", "value", obk.long, between = c("treatment", "gender"),
within = c("phase", "hour"), observed = "gender")
# the three calls return the same ANOVA table:
## Effect df MSE F ges p.value
## 1 treatment 2, 10 22.81 3.94 + .20 .05
## 2 gender 1, 10 22.81 3.66 + .11 .08
## 3 treatment:gender 2, 10 22.81 2.86 .18 .10
## 4 phase 1.60, 15.99 5.02 16.13 *** .15 .0003
## 5 treatment:phase 3.20, 15.99 5.02 4.85 * .10 .01
## 6 gender:phase 1.60, 15.99 5.02 0.28 .003 .71
## 7 treatment:gender:phase 3.20, 15.99 5.02 0.64 .01 .61
## 8 hour 1.84, 18.41 3.39 16.69 *** .13 <.0001
## 9 treatment:hour 3.68, 18.41 3.39 0.09 .002 .98
## 10 gender:hour 1.84, 18.41 3.39 0.45 .004 .63
## 11 treatment:gender:hour 3.68, 18.41 3.39 0.62 .01 .64
## 12 phase:hour 3.60, 35.96 2.67 1.18 .02 .33
## 13 treatment:phase:hour 7.19, 35.96 2.67 0.35 .009 .93
## 14 gender:phase:hour 3.60, 35.96 2.67 0.93 .01 .45
## 15 treatment:gender:phase:hour 7.19, 35.96 2.67 0.74 .02 .65
# "numeric" variables are per default converted to factors (as long as factorize = TRUE):
obk.long$hour2 < as.numeric(as.character(obk.long$hour))
# gives same results as calls before
aov_car(value ~ treatment * gender + Error(id/hour2*phase),
data = obk.long, observed = c("gender"))
# ANCOVA: adding a covariate (necessary to set factorize = FALSE)
aov_car(value ~ treatment * gender + age + Error(id/(phase*hour)),
data = obk.long, observed = c("gender", "age"), factorize = FALSE)
aov_4(value ~ treatment * gender + age + (phase*hourid),
data = obk.long, observed = c("gender", "age"), factorize = FALSE)
aov_ez("id", "value", obk.long, between = c("treatment", "gender"),
within = c("phase", "hour"), covariate = "age",
observed = c("gender", "age"), factorize = FALSE)
# aggregating over one withinsubjects factor (phase), with warning:
aov_car(value ~ treatment * gender + Error(id/hour), data = obk.long, observed = "gender")
aov_ez("id", "value", obk.long, c("treatment", "gender"), "hour", observed = "gender")
# aggregating over both withinsubjects factors (again with warning),
# only betweensubjects factors:
aov_car(value ~ treatment * gender + Error(id), data = obk.long, observed = c("gender"))
aov_4(value ~ treatment * gender + (1id), data = obk.long, observed = c("gender"))
aov_ez("id", "value", obk.long, between = c("treatment", "gender"), observed = "gender")
# only withinsubject factors (ignoring betweensubjects factors)
aov_car(value ~ Error(id/(phase*hour)), data = obk.long)
aov_4(value ~ (phase*hourid), data = obk.long)
aov_ez("id", "value", obk.long, within = c("phase", "hour"))
### changing defaults of ANOVA table:
# no dfcorrection & partial etasquared:
aov_car(value ~ treatment * gender + Error(id/(phase*hour)),
data = obk.long, anova_table = list(correction = "none", es = "pes"))
# no dfcorrection and no MSE
aov_car(value ~ treatment * gender + Error(id/(phase*hour)),
data = obk.long,observed = "gender",
anova_table = list(correction = "none", MSE = FALSE))
###########################
## 2: Followup Analysis ##
###########################
# use data as above
data(obk.long, package = "afex")
# 1. obtain afex_aov object:
a1 < aov_ez("id", "value", obk.long, between = c("treatment", "gender"),
within = c("phase", "hour"), observed = "gender")
# 1b. plot data:
lsmip(a1, gender ~ hour  treatment+phase)
# 2. obtain reference grid object:
r1 < lsmeans(a1, ~treatment +phase)
r1
# 3. create list of contrasts on the reference grid:
c1 < list(
A_B_pre = c(0, 1, 1, rep(0, 6)), # A versus B for pretest
A_B_comb = c(0, 0, 0, 0, 0.5, 0.5, 0, 0.5, 0.5), # A vs. B for post and followup combined
effect_post = c(0, 0, 0, 1, 0.5, 0.5, 0, 0, 0), # control versus A&B post
effect_fup = c(0, 0, 0, 0, 0, 0, 1, 0.5, 0.5), # control versus A&B followup
effect_comb = c(0, 0, 0, 0.5, 0.25, 0.25, 0.5, 0.25, 0.25) # control versus A&B combined
)
# 4. test contrasts on reference grid:
contrast(r1, c1)
# same as before, but using BonferroniHolm correction for multiple testing:
contrast(r1, c1, adjust = "holm")
# 2. (alternative): all pairwise comparisons of treatment:
lsmeans(a1, "treatment", contr = "pairwise")
#######################
## 3: Other examples ##
#######################
data(obk.long, package = "afex")
# replicating ?Anova using aov_car:
obk_anova < aov_car(value ~ treatment * gender + Error(id/(phase*hour)),
data = obk.long, type = 2)
# in contrast to aov you do not need the withinsubject factors outside Error()
str(obk_anova, 1, give.attr = FALSE)
## List of 6
## $ anova_table:Classes 'anova' and 'data.frame': 15 obs. of 6 variables:
## $ aov :List of 5
## $ Anova :List of 14
## $ lm :List of 13
## $ data :List of 3
## $ information:List of 5
obk_anova$Anova
## Type II Repeated Measures MANOVA Tests: Pillai test statistic
## Df test stat approx F num Df den Df Pr(>F)
## (Intercept) 1 0.970 318 1 10 0.0000000065 ***
## treatment 2 0.481 5 2 10 0.03769 *
## gender 1 0.204 3 1 10 0.14097
## treatment:gender 2 0.364 3 2 10 0.10447
## phase 1 0.851 26 2 9 0.00019 ***
## treatment:phase 2 0.685 3 4 20 0.06674 .
## gender:phase 1 0.043 0 2 9 0.82000
## treatment:gender:phase 2 0.311 1 4 20 0.47215
## hour 1 0.935 25 4 7 0.00030 ***
## treatment:hour 2 0.301 0 8 16 0.92952
## gender:hour 1 0.293 1 4 7 0.60237
## treatment:gender:hour 2 0.570 1 8 16 0.61319
## phase:hour 1 0.550 0 8 3 0.83245
## treatment:phase:hour 2 0.664 0 16 8 0.99144
## gender:phase:hour 1 0.695 1 8 3 0.62021
## treatment:gender:phase:hour 2 0.793 0 16 8 0.97237
## 
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
