Marginal Means

knitr::opts_chunk$set(
  collapse = TRUE,
  fig.width = 6,
  fig.asp = .4,
  warning = FALSE,
  message = FALSE,
  comment = "#>"
)
library("tidyverse")
library("kableExtra")

In the context of this package, "marginal means" refer to the values obtained by this three step process:

  1. Construct a "grid" of predictor values with all combinations of categorical variables, and where numeric variables are held at their means.
  2. Calculate adjusted predictions for each cell in that grid.
  3. Take the average of those adjusted predictions across one dimension of the grid to obtain the marginal means.

For example, consider a model with a numeric, a factor, and a logical predictor:

library(marginaleffects)

dat <- mtcars
dat$cyl <- as.factor(dat$cyl)
dat$am <- as.logical(dat$am)
mod <- lm(mpg ~ hp + cyl + am, data = dat)

Using the predictions function, we set the hp variable at its mean and compute predictions for all combinations for am and cyl:

predictions(mod, variables = c("am", "cyl"))

For illustration purposes, it is useful to reshape the above results:

pred <- predictions(mod, variables = c("am", "cyl")) %>%
    select(cyl, am, predicted) %>%
    pivot_wider(names_from = "am", values_from = "predicted") %>%
    rowwise() %>%
    mutate(`Marginal mean of cyl` = mean(c(`TRUE`, `FALSE`)))
row <- data.frame(x = "Marginal means of am",
                  y = mean(pred[["TRUE"]]),
                  z = mean(pred[["FALSE"]]))
colnames(row) <- colnames(pred)[1:3]
pred <- bind_rows(pred, row)
for (i in 2:ncol(pred)) {
    pred[[i]] <- sprintf("%.1f", pred[[i]])
}
pred[pred == "NA"] <- ""
kbl(pred) %>% 
    kable_styling() %>%
    add_header_above(c(" " = 1, "am" = 2, " " = 1))

The marginal means of am and cyl are obtained by taking the mean of the adjusted predictions across cells. The marginalmeans function gives us the same results easily:

marginalmeans(mod)

The same results can be obtained using the very powerful emmeans package:

library(emmeans)
emmeans(mod, specs = "cyl")
emmeans(mod, specs = "am")

Tidy summaries

The summary, tidy, and glance functions are also available to summarize and manipulate the results:

me <- marginalmeans(mod)

tidy(me)

glance(me)

summary(me)

Thanks to those tidiers, we can also present the results in the style of a regression table using the modelsummary package:

library("modelsummary")

modelsummary(me,
             title = "Estimated Marginal Means",
             estimate = "{estimate} ({std.error}){stars}",
             statistic = NULL,
             group = term + group ~ model)

Case study: Multinomial Logit

This example requires version 0.2.0 of the marginaleffects package.

To begin, we generate data and estimate a large model:

library(nnet)
library(marginaleffects)

set.seed(1839)
n <- 1200
x <- factor(sample(letters[1:3], n, TRUE))
y <- vector(length = n)
y[x == "a"] <- sample(letters[4:6], sum(x == "a"), TRUE)
y[x == "b"] <- sample(letters[4:6], sum(x == "b"), TRUE, c(1 / 4, 2 / 4, 1 / 4))
y[x == "c"] <- sample(letters[4:6], sum(x == "c"), TRUE, c(1 / 5, 3 / 5, 2 / 5))

dat <- data.frame(x = x, y = factor(y))
tmp <- as.data.frame(replicate(20, factor(sample(letters[7:9], n, TRUE))))
dat <- cbind(dat, tmp)
void <- capture.output({
    mod <- multinom(y ~ ., dat)
})

Try to compute marginal means, but realize that you are trying to get marginal means for the wrong type of prediction:

marginalmeans(mod)

Try to compute marginal means, but realize that your grid won’t fit in memory:

marginalmeans(mod, type = "probs")

Use the variables and variables_grid arguments to compute marginal means over a more reasonably sized grid:

marginalmeans(mod,
              type = "probs",
              variables = c("x", "V1"),
              variables_grid = paste0("V", 2:5))


Try the marginaleffects package in your browser

Any scripts or data that you put into this service are public.

marginaleffects documentation built on Oct. 19, 2021, 1:09 a.m.