req_suggested_packages <- c("see", "performance", "ggplot2")
pcheck <- lapply(req_suggested_packages, requireNamespace, quietly = TRUE)
if (any(!unlist(pcheck))) {
  message("Required package(s) for this vignette are not available/installed and code will not be executed.")
  knitr::opts_chunk$set(eval = FALSE)
}
options(width = 90)
knitr::opts_chunk$set(dpi = 72)
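Like all statistical models, ANOVAs have a number of assumptions that should hold for valid inferences. These assumptions are that observations are independent and identically distributed (i.i.d.), homogeneity of variances, sphericity, and normality of the residuals.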
The most important assumption is generally the i.i.d. assumption, specifically the independence part: if it does not hold, the inferences are likely invalid. This assumption cannot be tested empirically but needs to hold on conceptual or logical grounds. For example, in an ideal completely between-subjects design each observation comes from a different participant who is randomly sampled from a population, so we know that all observations are independent. Often, we collect multiple observations from the same participant in a within-subject or repeated-measures design. To ensure the i.i.d. assumption holds in this case, we need to specify an ANOVA with within-subject factors. However, if we have a data set with multiple sources of non-independence (such as participants and items), ANOVA models cannot be used and we have to use a mixed model instead.
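As a minimal sketch of the last case, the following fits a mixed model with crossed random intercepts for participants and items using afex's mixed() function and the fhch2010 data that ships with afex. The choice of predictors, the restriction to words, and the intercepts-only random-effects structure are assumptions made to keep the example short; a real analysis would probably also consider random slopes.

```r
library(afex)

data("fhch2010", package = "afex")

# Keep the example small: word stimuli only
words <- droplevels(subset(fhch2010, stimulus == "word"))

# Two sources of non-independence: participants (id) and items (item).
# method = "S" requests Satterthwaite degrees of freedom, which is
# usually faster than the default Kenward-Roger method.
m1 <- mixed(log_rt ~ task * length + (1 | id) + (1 | item),
            data = words, method = "S")
m1
```

An ANOVA on these data would have to aggregate over items first, which ignores the item-level variability that the mixed model accounts for.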
The other assumptions can be tested empirically, either graphically or using statistical assumption tests. However, there are different opinions on how useful statistical assumption tests are when done in an automatic manner for each ANOVA. Whereas this is the position taken in some statistics books, it runs the risk of reducing the statistical analysis to a "cookbook" or "flowchart". Real-life data analysis is often more complex than such simple rules. Therefore, it is often more productive to explore one's data using both descriptive statistics and graphical displays. This data exploration should allow one to judge whether the other ANOVA assumptions hold to a sufficient degree. For example, plotting one's ANOVA results using afex_plot and including a reasonable display of the individual data points often allows one to judge both the homogeneity of variance and the normality of the residuals assumption.
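A minimal sketch of such a plot, assuming the obk.long data that ships with afex and a purely between-subjects model (the choice of factors for the x-axis and the trace is arbitrary):

```r
library(afex)
library(ggplot2)

data(obk.long, package = "afex")

# Between-subjects ANOVA; afex aggregates the repeated measures per participant
fit <- aov_ez("id", "value", obk.long, between = c("treatment", "gender"))

# Means with error bars, plus the individual data points in the background
p <- afex_plot(fit, x = "treatment", trace = "gender", error = "between")
p
```

If the spread of the points differs strongly between cells, or the points within a cell are clearly skewed, that is a hint to look more closely at the corresponding assumption.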
Let us take a look at all three empirically testable assumptions in detail. ANOVAs are often robust to mild violations of the homogeneity of variances assumption. If this assumption is clearly violated, we have learned something important about the data, namely variance heterogeneity, that requires further study. Some further statistical solutions are discussed below.
If the main goal of an ANOVA is to see whether or not certain effects are significant, then the assumption of normality of the residuals is only required for small samples, thanks to the central limit theorem. As shown by Lumley et al. (2002), with sample sizes of a few hundred participants even extreme violations of the normality assumption are unproblematic. So mild violations of this assumption are usually no problem with sample sizes exceeding 30.
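The following small simulation is one way to get a feeling for this. It is only a sketch under stated assumptions: three groups, strongly skewed (exponential) errors, no true group differences, and arbitrarily chosen group sizes of 10 and 100. It estimates how often a one-way ANOVA is significant at the 5% level even though the null hypothesis is true.

```r
set.seed(1)

# Proportion of significant one-way ANOVAs when the null hypothesis is true
# and the errors are strongly skewed (exponential)
false_positive_rate <- function(n_per_group, n_sim = 2000, alpha = 0.05) {
  p_values <- replicate(n_sim, {
    group <- factor(rep(c("a", "b", "c"), each = n_per_group))
    y <- rexp(3 * n_per_group)  # same distribution in every group
    anova(lm(y ~ group))[["Pr(>F)"]][1]
  })
  mean(p_values < alpha)
}

rates <- c(n_10 = false_positive_rate(10), n_100 = false_positive_rate(100))
rates
```

Rates close to the nominal 0.05 indicate that the F-test keeps its type I error rate despite the non-normal errors.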
Finally, the default afex behaviour is to correct for violations of sphericity using the Greenhouse-Geisser correction. Whereas this default may in some situations produce a small loss in statistical power, this seems preferable to a situation in which violations of sphericity are overlooked and tests become anti-conservative (i.e., more false positive results).
Thus, my position as the afex developer is that an appropriate exploratory data analysis is often better than just blindly applying statistical assumption tests. Nevertheless, assumption tests are of course an important tool in the statistical toolbox and can be helpful in many situations. Thus, I am thankful to Mattan S. Ben-Shachar who has provided them for ANOVAs in afex. The following text provides his introduction to the assumption tests based on the performance and see packages.
afex comes with a set of built-in functions to help in testing the assumptions of ANOVA designs. Generally speaking, the testable assumptions of ANOVA are^[There are also the assumptions that (a) the model is correctly specified and that (b) errors are independent, but there is no "hard" test for these assumptions.]:

1. Homogeneity of variances, which can be tested with performance::check_homogeneity().
2. Sphericity, which can be tested with performance::check_sphericity().
3. Normality of residuals, which can be tested with performance::check_normality().

What follows is a brief review of these assumptions and their tests.
library(afex)
library(performance) # for assumption checks
The homogeneity of variances assumption, for between-subjects designs, states that the within-group errors all share a common variance around their group's mean. This can be tested with Levene's test:
data(obk.long, package = "afex")

o1 <- aov_ez("id", "value", obk.long,
             between = c("treatment", "gender"))

check_homogeneity(o1)
These results indicate that homogeneity is not significantly violated.
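To see what Levene's test does conceptually, here is a minimal by-hand sketch in base R: an ANOVA on the absolute deviations of each participant from their cell mean. It illustrates the idea and is not meant to reproduce the exact implementation or output of check_homogeneity(). The helper columns cell and abs_dev are introduced here for illustration only.

```r
library(afex)  # only needed for the obk.long data

data(obk.long, package = "afex")

# One value per participant, as in a purely between-subjects analysis
agg <- aggregate(value ~ id + treatment + gender, data = obk.long, FUN = mean)
agg$cell <- interaction(agg$treatment, agg$gender, drop = TRUE)

# Absolute deviation of each participant from their cell mean
agg$abs_dev <- abs(agg$value - ave(agg$value, agg$cell))

# If the variances are homogeneous, the mean absolute deviation should not
# differ between cells, so the F-test for 'cell' should not be significant
levene_tab <- anova(lm(abs_dev ~ cell, data = agg))
levene_tab
```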
ANOVAs are generally robust to "light" heteroscedasticity, but there are various other methods (not available in afex) for getting robust error estimates.
Another alternative is to drop this assumption altogether and use permutation tests (e.g., with permuco) or bootstrapped estimates (e.g., with boot).
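The logic of a permutation test can be sketched in a few lines of base R. This is a hand-rolled illustration for a one-way between-subjects design, not the algorithm used by permuco, which offers more elaborate permutation schemes for factorial and repeated-measures designs. The helper f_stat() and the number of permutations are assumptions made for this example.

```r
library(afex)  # only needed for the obk.long data

data(obk.long, package = "afex")

# One value per participant; only the treatment factor is considered here
agg <- aggregate(value ~ id + treatment, data = obk.long, FUN = mean)

# F statistic of a one-way ANOVA
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

set.seed(1)
f_obs  <- f_stat(agg$value, agg$treatment)
# Null distribution: shuffle the group labels and recompute F each time
f_perm <- replicate(2000, f_stat(agg$value, sample(agg$treatment)))

# Share of permuted F values at least as large as the observed one
p_perm <- (sum(f_perm >= f_obs) + 1) / (length(f_perm) + 1)
p_perm
```

Because the reference distribution is built from the data themselves, the resulting p-value does not rely on normally distributed errors with a common variance.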
data("fhch2010", package = "afex")

a1 <- aov_ez("id", "log_rt", fhch2010,
             between = "task",
             within = c("density", "frequency", "length", "stimulus"))
We can use check_sphericity() to run Mauchly's test of sphericity:
check_sphericity(a1)
We can see that both the error terms of the length:stimulus and task:length:stimulus interactions significantly violate the assumption of sphericity at p = 0.021. Note that as task is a between-subjects factor, both these interaction terms share the same error term!
For ANOVA tables, afex offers both the Greenhouse-Geisser (which is used by default) and the Huynh-Feldt corrections. For follow-up tests with emmeans, a multivariate model can be used, which does not assume sphericity (this is used by default since afex 1.0). Both can be set globally with:
afex_options(
  correction_aov = "GG", # or "HF"
  emmeans_model = "multivariate"
)
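The correction can also be chosen for a single ANOVA table through the correction argument of the anova() method for afex objects. The sketch below uses the smaller obk.long data rather than the model above to keep it quick to run; the object names are arbitrary.

```r
library(afex)

data(obk.long, package = "afex")

a_rm <- aov_ez("id", "value", obk.long,
               between = c("treatment", "gender"),
               within = c("phase", "hour"))

# Same model, with and without the sphericity correction
tab_none <- anova(a_rm, correction = "none")
tab_gg   <- anova(a_rm, correction = "GG")

# The correction shrinks the degrees of freedom of within-subjects effects
cbind(uncorrected = tab_none[, "num Df"], GG = tab_gg[, "num Df"])
```

Effects that involve only between-subjects factors keep their degrees of freedom, because sphericity is not at issue for them.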
The normality of residuals assumption is concerned with the errors that make up the various error terms in the ANOVA. Although the Shapiro-Wilk test can be used to test for deviation from a normal distribution, this test tends to have high type-I error rates. Instead, one can visually inspect the residuals using quantile-quantile plots (also known as qq-plots). For example:
data("stroop", package = "afex")

stroop1 <- subset(stroop, study == 1)
stroop1 <- na.omit(stroop1)

s1 <- aov_ez("pno", "rt", stroop1,
             within = c("condition", "congruency"))

is_norm <- check_normality(s1)

plot(is_norm)

plot(is_norm, type = "qq")
If the residuals were normally distributed, we would see them falling close to the diagonal line, inside the 95% confidence bands around the qq-line.
We can further de-trend the plot and show not the expected quantile but the deviation from the expected quantile, which may help reduce visual bias.
plot(is_norm, type = "qq", detrend = TRUE)
Wow! The deviation from normality is now visually much more pronounced!
As with the assumption of homogeneity of variances, we can resort to using permutation tests for ANOVA tables and bootstrap estimates / contrasts.
Another popular solution is to apply a monotonic transformation to the dependent variable. This should not be done lightly, as it changes the interpretability of the results (from the observed scale to the transformed scale). Luckily for us, it is common to log transform reaction times, which we can easily do^[But note that the ANOVA no longer tests whether any difference between the means is significantly different from 0, but whether any ratio between the means is significantly different from 1.]:
s2 <- aov_ez("pno", "rt", stroop1,
             transformation = "log",
             within = c("condition", "congruency"))

is_norm <- check_normality(s2)

plot(is_norm, type = "qq", detrend = TRUE)
Success: after the transformation, the residuals (on the log scale) do not deviate more than expected from errors sampled from a normal distribution (they are mostly contained within the 95% confidence bands)!
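The change in interpretation on the log scale can be checked numerically. The sketch below uses simulated, purely hypothetical reaction times (not the stroop data) and shows that a difference between means of log values corresponds to a ratio on the original scale, more precisely a ratio of geometric means. The helper geo_mean() is introduced here for illustration.

```r
set.seed(1)

# Hypothetical reaction times (in seconds) for two conditions
rt_a <- rlnorm(50, meanlog = -0.5, sdlog = 0.3)
rt_b <- rlnorm(50, meanlog = -0.3, sdlog = 0.3)

# Geometric mean: back-transformed mean of the log values
geo_mean <- function(x) exp(mean(log(x)))

diff_log <- mean(log(rt_b)) - mean(log(rt_a))  # difference on the log scale
ratio    <- geo_mean(rt_b) / geo_mean(rt_a)    # ratio on the original scale

# Back-transforming the log-scale difference gives the ratio
all.equal(exp(diff_log), ratio)
```

A log-scale difference of 0 therefore corresponds to a ratio of 1 on the original scale.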