cofad"
In cofad: Contrast Analyses for Factorial Designs

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Citation

If you use cofad, please cite it in your work as:

Burkhardt, M., & Titz, J. (2019). cofad: Contrast analysis for factorial designs. R package version 0.1.0. https://CRAN.R-project.org/package=cofad

Introduction

Cofad is an R package for conducting COntrast analysis in FActorial Designs like ANOVAs. If contrast analysis was to win a price it would be the one for the most underestimated, underused statistical technique. This is a pitty because in every case a contrast analysis is at least as good as an ANOVA, but in most cases it is actually better. To the question why, the simplest answer is that contrast analysis gets rid off the unspecific omnibus-hypothesis there are differences somewhere and substitutes it with a very specific numerical hypothesis. Furthermore, contrast analysis focuses on effects instead of significance. This is expressed doubly: First, there are three different effect sizes for contrast analysis: $r_\mathrm{effectsize}$, $r_\mathrm{contrast}$ and $r_\mathrm{alerting}$. Second, the effect size refers not to the data but to the tested hypothesis. The larger the effect, the more this speaks for the hypothesis. One can even compare different hypotheses against each other (hello experimentum crucis) by looking at the effect size for each hypothesis.

If we sparked your interest we recommend to read some introductory literature like @Rosenthal2000, @furr2004, or, for the German-speaking audience, @sedlmeier2018. Contrast analysis is actually really easy to understand if you know what a correlation is. In this vignette we assume you know some basics about contrast analysis and want to use it for your own data. We will start with between-subjects designs and then show applications for within designs and mixed designs.

Between-Subjects Designs

Let us first load the package:

library(cofad)

Now we need some data and hypotheses. We can simply take the data from @furr2004, where we have different empathy ratings of students from different majors.

d <- data.frame(empathy = c(51, 56, 61, 58, 54, 62, 67, 57, 65, 59, 50, 49, 47, 45,
                            44, 50, 45, 40, 49, 41),
                major = as.factor(
                  rep(c("psychology", "education", "business",
                              "chemistry"), each = 5)))
head(d)

Furr states three hypotheses:

Contrast A: Psychology majors have higher empathy scores than Education majors ($\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = -1$).
Contrast B: Business majors have higher empathy scores than Chemistry majors ($\lambda_\mathrm{bus} = 1, \lambda_\mathrm{chem} = -1$).
Contrast C: On average, Psychology and Education majors have higher empathy scores than Business and Chemistry majors ($\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = 1, \lambda_\mathrm{bus} = -1, \lambda_\mathrm{chem} = -1$).

These hypotheses are only mean comparisons, but this is a good way to start. Let's use cofad to conduct the contrast analysis:

ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 1, "education" = -1,
                                       "business" = 0, "chemistry" = 0),
                    data = d)
ca

The print method only shows some basic information, but we can use the summary method for more details:

summary(ca)

From this table, $r_\mathrm{effectsize}$ is probably the most useful statistic. It is just the correlation between the lambdas and the dependent variable, which can also easily by calculated manually:

lambdas <- rep(c(1, -1, 0, 0), each = 5)
cor(d$empathy, lambdas)

The other two hypotheses can be tested accordingly:

ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 0, "education" = 0,
                                       "business" = 1, "chemistry" = -1),
                    data = d)
ca
ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 1, "education" = 1,
                                       "business" = -1, "chemistry" = -1),
                    data = d)
ca

You will find that the numbers are identical to the ones presented in @furr2004. Now, imagine we have a more fun hypothesis and not just mean differences. From an elaborate theory we could derive that the means should be 73, 61, 51 and 38. We can test this with cofad directly because cofad will transfer the chosen lambdas into proper lambdas (the mean of the lambdas has to be 0):

ca <- calc_contrast(dv = empathy, between = major,
                    lambda_between = c("psychology" = 73, "education" = 61,
                                       "business" = 51, "chemistry" = 38),
                    data = d)
ca

The manual test shows the same effect size:

lambdas <- rep(c(73, 61, 51, 38), each = 5)
cor(d$empathy, lambdas)

Let us now do an analysis for within-subjects designs.

Within-Subjects Designs

For within designs the calculations are quite different, but cofad takes care of it and we just have to use the within parameters within and lambda_within instead of the between equivalents. As an example we use Table 16.5 from @sedlmeier2018. Reading ability was assessed for eight participants under four different conditions. The hypothesis is that you can read best wihout music, white noise reduces your reading ability and music (independent of type) reduces it even further.

d <- data.frame(reading_test = c(27, 25, 30, 29, 30, 33, 31, 35,
                                 25, 26, 32, 29, 28, 30, 32, 34,
                                 21, 25, 23, 26, 27, 26, 29, 31, 
                                 23, 24, 24, 28, 24, 26, 27, 32),
                participant = as.factor(rep(1:8, 4)),
                music = as.factor(rep(c("without music", "white noise", "classic", "jazz"), each = 8)))
head(d)
calc_contrast(dv = reading_test, within = music,
              lambda_within = c("without music" = 1.25, 
                                "white noise" = 0.25, "classic" = -0.75, "jazz" = -0.75),
             ID = participant, data = d)

You can see that the siginifance test is just a $t$-test and the reported effect size is also for a mean comparison ($g$). (The $t$-test is one-tailed, because contrast analysis has always a specific hypotheses.) When conducting the analysis manually, we can see why:

mtr <- matrix(d$reading_test, ncol = 4)
lambdas <- c(1.25, 0.25, -0.75, -0.75)
lc1 <- mtr %*% lambdas
t.test(lc1)

Only the linear combination of the dependent variable and the contrast weights for each participant is needed. With these values a normal $t$-test against 0 is conducted. While you can do this manually, using cofad is quicker and it also gives you more information such as the different effect sizes.

Mixed Designs

The idea of mixed designs is a combination of between and within factors. In this case cofad calculates the L-Value for the within factor, which is treated as a new dependent variable. Thereafter, these used for the between analysis. @Rosenthal2000, in their table 5.3, give an illustrative example: Nine children are measured four times (within), but they also belong to different groups of age (between).

There are two hypotheses:

linear increase over time (within) ($\lambda_\mathrm{1} = -3, \lambda_\mathrm{2} = -1, \lambda_\mathrm{3} = 1, \lambda_\mathrm{4} = 3$)
linear increase over age (between) ($\lambda_\mathrm{Age 8} = -1, \lambda_\mathrm{Age 10} = 0, \lambda_\mathrm{Age12} = 1$)

Let's have a look at the data and calculation:

tab53 <- data.frame(
    Var = c(3, 1, 4, 4, 5, 5, 6, 5, 7, 2, 2, 5,
            5, 6, 7, 6, 6, 8, 3, 1, 5, 4, 5, 6,
            7, 6, 8, 3, 2, 5, 6, 6, 7, 8, 8, 9),
    age = as.factor(
      rep(rep(c("Age 8", "Age 10", "Age 12"), c(3, 3, 3)), 4)
      ),
    time = as.factor(rep(1:4, c(9, 9, 9, 9))),
    ID = as.factor(rep(1:9, 4 ))
    )
head(tab53)
lambda_within <- c("1" = -3, "2" = -1, "3" = 1, "4" = 3)
lambda_between <-c("Age 8" = -1, "Age 10" = 0, "Age 12" = 1)

contr_mx <- calc_contrast(dv = Var, 
                          between = age,
                          lambda_between = lambda_between,
                          within = time,
                          lambda_within = lambda_within,
                          ID = ID, 
                          data = tab53
                          )
contr_mx

The results look like a contrast analysis for between-subject designs. Again, the summary gives some more details: The effect sizes and within group means and standard errors of the L-values.

summary(contr_mx)

This was the introduction to cofad. If you have any suggestions you can post it on our gitlab site: https://gitlab.hrz.tu-chemnitz.de/burma--tu-chemnitz.de/cofad.git