knitr::opts_chunk$set( collapse = TRUE, comment = "#>", dev = "png", fig.width = 7, fig.height = 3.5, message = FALSE, warning = FALSE, package.startup.message = FALSE ) options(width = 800) arrow_color <- "#FF00cc" my_predictions <- NULL pkgs <- c("ggplot2", "marginaleffects", "see", "parameters") options(modelbased_join_dots = FALSE) options(modelbased_select = "minimal") if (!all(insight::check_if_installed(pkgs, quietly = TRUE))) { knitr::opts_chunk$set(eval = FALSE) }
This vignette is the first in a 4-part series:
Contrasts and Pairwise Comparisons
Comparisons of Slopes, Floodlight and Spotlight Analysis (Johnson-Neyman Intervals)
A reason to compute adjusted predictions (or estimated marginal means) is to help understanding the relationship between predictors and outcome of a regression model. The next step, which often follows this, is to see if there are statistically significant differences. These could be, for example, differences between groups, i.e. between the levels of categorical predictors or whether trends differ significantly from each other.
The modelbased package provides a function, estimate_contrasts()
, which does exactly this: testing differences of predictions or marginal means for statistical significance. This is usually called contrasts or (pairwise) comparisons, or marginal effects (if the difference refers to a one-unit change of predictors). This vignette shows some examples how to use the estimate_contrasts()
function and how to test whether differences in predictions are statistically significant.
First, different examples for pairwise comparisons are shown, later we will see how to test differences-in-differences (in the emmeans package, also called interaction contrasts).
episode
, do levels differ?We start with a toy example, where we have a linear model with two categorical predictors. No interaction is involved for now.
We display a simple table of regression coefficients, created with model_parameters()
from the parameters package.
library(modelbased) library(parameters) library(ggplot2) set.seed(123) n <- 200 d <- data.frame( outcome = rnorm(n), grp = as.factor(sample(c("treatment", "control"), n, TRUE)), episode = as.factor(sample(1:3, n, TRUE)), sex = as.factor(sample(c("female", "male"), n, TRUE, prob = c(0.4, 0.6))) ) model1 <- lm(outcome ~ grp + episode, data = d) model_parameters(model1)
Let us look at the adjusted predictions.
my_predictions <- estimate_means(model1, "episode") my_predictions plot(my_predictions)
We now see that, for instance, the predicted outcome when espisode = 2
is 0.2.
We could now ask whether the predicted outcome for episode = 1
is significantly different from the predicted outcome at episode = 2
.
p <- plot(my_predictions, pointrange = list(position = position_dodge(width = 0.4))) line_data <- as.data.frame(my_predictions)[1:2, ] p + annotate( "segment", x = as.numeric(line_data$episode[1]) + 0.06, xend = as.numeric(line_data$episode[2]) - 0.06, y = line_data$Mean[1], yend = line_data$Mean[2], color = arrow_color, arrow = arrow(length = unit(0.1, "inches"), ends = "both", angle = 40) ) + ggtitle("Within \"episode\", do levels 1 and 2 differ?")
To do this, we use the estimate_contrasts()
function. By default, a pairwise comparison is performed. You can specify other comparisons as well, using the comparison
argument. For now, we go on with the simpler example of contrasts or pairwise comparisons.
# argument `comparison` defaults to "pairwise" estimate_contrasts(model1, "episode")
For our quantity of interest, the contrast between episode levels 2 and 1, we see the value 0.36, which is exactly the difference between the predicted outcome for episode = 1
(-0.16) and episode = 2
(0.20). The related p-value is 0.031, indicating that the difference between the predicted values of our outcome at these two levels of the factor episode is indeed statistically significant.
We can also define "representative values" via the contrast
or by
arguments. For example, we could specify the levels of episode
directly, to simplify the output:
estimate_contrasts(model1, contrast = "episode=c(1,2)")
The next example includes a pairwise comparison of an interaction between two categorical predictors.
model2 <- lm(outcome ~ grp * episode, data = d) model_parameters(model2)
First, we look at the predicted values of outcome for all combinations of the involved interaction term.
my_predictions <- estimate_means(model2, by = c("episode", "grp")) my_predictions plot(my_predictions)
We could now ask whether the predicted outcome for episode = 2
is significantly different depending on the level of grp
? In other words, do the groups treatment
and control
differ when episode = 2
?
p <- plot(my_predictions, pointrange = list(position = position_dodge(width = 0.4))) line_data <- as.data.frame(my_predictions)[3:4, 1:3] line_data$group_col <- "control" p + annotate( "segment", x = as.numeric(line_data$episode[1]) - 0.06, xend = as.numeric(line_data$episode[2]) + 0.06, y = line_data$Mean[1], yend = line_data$Mean[2], color = arrow_color, arrow = arrow(length = unit(0.1, "inches"), ends = "both", angle = 40) ) + ggtitle("Within level 2 of \"episode\", do treatment and control group differ?")
Again, to answer this question, we calculate all pairwise comparisons, i.e. the comparison (or test for differences) between all combinations of our focal predictors. The focal predictors we're interested here are our two variables used for the interaction.
# we want "episode = 2-2" and "grp = control-treatment" estimate_contrasts(model2, contrast = c("episode", "grp"))
For our quantity of interest, the contrast between groups treatment
and control
when episode = 2
is 0.06. We find this comparison in row 10 of the above output.
As we can see, estimate_contrasts()
returns pairwise comparisons of all possible combinations of factor levels from our focal variables. If we're only interested in a very specific comparison, we have two options to simplify the output:
We could directly formulate this comparison. Therefore, we need to know the parameters of interests (see below).
We pass specific values or levels to the contrast
argument.
estimate_means(model2, by = c("episode", "grp"))
In the above output, each row is considered as one coefficient of interest. Our groups we want to include in our comparison are rows two (grp = control
and episode = 2
) and five (grp = treatment
and episode = 2
), so our "quantities of interest" are b2
and b5
. Our null hypothesis we want to test is whether both predictions are equal, i.e. comparison = "b5 = b2"
(we could also specify "b2 = b5"
, results would be the same, just signs are switched). We can now calculate the desired comparison directly:
# compute specific contrast directly estimate_contrasts(model2, contrast = c("episode", "grp"), comparison = "b2 = b5")
Again, using representative values for the contrast
argument, we can also simplify the output using an alternative syntax:
# return pairwise comparisons for specific values, in # this case for episode = 2 in both groups estimate_contrasts(model2, contrast = c("episode=2", "grp"))
This is equivalent to the above example, where we directly specified the comparison we're interested in. However, the comparison
argument might provide more flexibility in case you want more complex comparisons. See examples below.
We can repeat the steps shown above to test any combination of group levels for differences.
For instance, we could now ask whether the predicted outcome for episode = 1
in the treatment
group is significantly different from the predicted outcome for episode = 3
in the control
group.
p <- plot(my_predictions) line_data <- as.data.frame(my_predictions)[c(2, 5), 1:3] line_data$group_col <- "treatment" p + annotate( "segment", x = as.numeric(line_data$episode[1]) + 0.06, xend = as.numeric(line_data$episode[2]) - 0.06, y = line_data$Mean[1], yend = line_data$Mean[2], color = arrow_color, arrow = arrow(length = unit(0.1, "inches"), ends = "both", angle = 40) ) + ggtitle("Do different episode levels differ between groups?")
The contrast we are interested in is between episode = 1
in the treatment
group and episode = 3
in the control
group. These are the predicted values in rows three and four (c.f. above table of predicted values), thus we comparison
whether "b4 = b3"
.
estimate_contrasts(model2, contrast = c("episode", "grp"), comparison = "b4 = b3")
Another way to produce this pairwise comparison, we can reduce the table of predicted values by providing specific values or levels in the by
or contrast
argument:
estimate_means(model2, by = c("episode=c(1,3)", "grp"))
episode = 1
in the treatment
group and episode = 3
in the control
group refer now to rows two and three in the reduced output, thus we also can obtain the desired comparison this way:
estimate_contrasts( model2, contrast = c("episode = c(1, 3)", "grp"), comparison = "b3 = b2" )
The comparison
argument also allows us to compare difference-in-differences (aka interaction contrasts). For example, is the difference between two episode levels in one group significantly different from the difference of the same two episode levels in the other group?
my_predictions <- estimate_means(model2, c("grp", "episode")) p <- plot(my_predictions, pointrange = list(position = position_dodge(width = 0.4))) line_data <- as.data.frame(my_predictions)[, 1:3, ] line_data$group_col <- "1" p + annotate( "segment", x = as.numeric(line_data$grp[1]) - 0.05, xend = as.numeric(line_data$grp[1]) - 0.05, y = line_data$Mean[1], yend = line_data$Mean[2], color = "orange", arrow = arrow(length = unit(0.1, "inches"), ends = "both", angle = 40, type = "closed") ) + annotate( "segment", x = as.numeric(line_data$grp[4]) - 0.05, xend = as.numeric(line_data$grp[4]) - 0.05, y = line_data$Mean[4], yend = line_data$Mean[5], color = "orange", arrow = arrow(length = unit(0.1, "inches"), ends = "both", angle = 40, type = "closed") ) + annotate( "segment", x = as.numeric(line_data$grp[1]) - 0.05, xend = as.numeric(line_data$grp[4]) - 0.05, y = (line_data$Mean[1] + line_data$Mean[2]) / 2, yend = (line_data$Mean[4] + line_data$Mean[5]) / 2, color = arrow_color, arrow = arrow(length = unit(0.1, "inches"), ends = "both", angle = 40, type = "closed") ) + ggtitle("Differnce-in-differences")
As a reminder, we look at the table of predictions again:
estimate_means(model2, c("episode", "grp"))
The first difference of episode levels 1 and 2 in the control group refer to rows one and two in the above table (b1
and b2
). The difference for the same episode levels in the treatment group refer to the difference between rows four and five (b4
and b5
). Thus, we have b1 - b2
and b4 - b5
, and our null hypothesis is that these two differences are equal: comparison = "(b1 - b2) = (b4 - b5)"
.
estimate_contrasts( model2, c("episode", "grp"), comparison = "(b1 - b2) = (b4 - b5)" )
Let's replicate this step-by-step:
episode = 1
in the control group is 0.03.episode = 2
in the control group is 0.23.episode = 1
in the treatment group is -0.39.episode = 2
in the treatment group is 0.18.While the current implementation in estimate_contrasts()
already covers many common use cases for testing contrasts and pairwise comparison, there still might be the need for more sophisticated comparisons. In this case, we recommend using the marginaleffects package directly. Some further related recommended readings are the vignettes about Comparisons or Hypothesis Tests.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.