```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  dev = "png",
  fig.width = 7,
  fig.height = 3.5,
  message = FALSE,
  warning = FALSE
)
options(width = 800)
if (!requireNamespace("ggplot2", quietly = TRUE) ||
    !requireNamespace("parameters", quietly = TRUE) ||
    !requireNamespace("margins", quietly = TRUE)) {
  knitr::opts_chunk$set(eval = FALSE)
}
```
There is no common language across fields regarding a unique meaning of "marginal effects", so the wording throughout this package may vary. The most generic description of what ggeffects does is: ggeffects allows us to interpret a statistical model by making predictions generated by the model while holding the non-focal variables constant and varying the focal variable(s).
In the following, some examples are shown to make clear what is actually calculated and returned by the package's functions ggpredict(), ggemmeans() and ggeffect(), and how this differs from other functions or software packages that calculate marginal effects.
First, we fit a simple linear model and look at the coefficient of the only predictor, Sepal.Width.
```r
data(iris)
model1 <- lm(Sepal.Length ~ Sepal.Width, data = iris)
coef(model1)["Sepal.Width"]
```
In this basic example, the coefficient is simply the slope of the regression line:
```r
library(ggplot2)
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point() +
  geom_abline(
    intercept = coef(model1)["(Intercept)"],
    slope = coef(model1)["Sepal.Width"]
  )
```
For this simple linear model, the slope of the regression line is always the same for each value of our predictor, Sepal.Width. We can check this by generating predictions from our model.
```r
library(ggeffects)
pr <- ggpredict(model1, "Sepal.Width")
pr
```
"Predictions" returned by ggeffects are essentially interpretations of regression coefficients in terms of comparison. We can compare how much our outcome (Sepal.Length
) changes on average, when the focal term (in this case: Sepal.Width
) varies: For example, what is the average value of Sepal.Length
for observations with a value of, say, 2
for Sepal.Width
, compared to observations with a 3
for Sepal.Width
?
ggeffects returns predictions for representative values of the focal term(s), hence you see many predicted values (including confidence intervals) in the output for the different values of the focal term(s).
If we now look at the difference between any two predicted values whose Sepal.Width values are one unit apart, we see that these differences are identical:
```r
# Difference between predicted values for Sepal.Width = 2 and 3
pr <- ggpredict(model1, "Sepal.Width [2,3]")
round(diff(pr$predicted), 4)

# Difference between predicted values for Sepal.Width = 4 and 5
pr <- ggpredict(model1, "Sepal.Width [4,5]")
round(diff(pr$predicted), 4)
```
Furthermore, the difference between predicted values that differ by 1 in the focal term (Sepal.Width) equals the regression coefficient. This is because the interpretation of a regression coefficient can be seen as the average difference in the outcome, "comparing two individuals that differ in one predictor [a difference of 1 for Sepal.Width in this case], while being at the same levels of all other predictors" (Gelman, Hill, and Vehtari 2020, page 494). We don't have any other predictors in this example, so we don't go into further detail here.
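We can verify this directly with base R's predict() (a small check using the model fitted above, not a function from ggeffects):

```r
# Predictions for Sepal.Width = 2 and 3 (one unit apart)
p <- predict(model1, newdata = data.frame(Sepal.Width = c(2, 3)))
# The difference between them reproduces the coefficient
diff(p)
coef(model1)["Sepal.Width"]
```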
Thus, the association - or effect - between Sepal.Length and Sepal.Width is the same for every value of Sepal.Width. This means that, for simple linear models, the regression coefficient is also the marginal effect:
```r
library(margins)
margins(model1)
```
The marginal effect here is at the same time the average marginal effect, because on average, the effect of Sepal.Width on Sepal.Length is -0.2234: when Sepal.Width changes by 1, the value of Sepal.Length changes by -0.2234 on average.
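As a rough sketch of what "average marginal effect" means here (a conceptual approximation, not how margins() is actually implemented), we can average numerical derivatives of the predictions over the observed data:

```r
# Conceptual sketch: average the numerical derivative of the predictions
# over all observations (not the actual code used by margins())
h <- 1e-5
slopes <- (
  predict(model1, newdata = transform(iris, Sepal.Width = Sepal.Width + h)) -
    predict(model1, newdata = iris)
) / h
mean(slopes)  # for a linear model, this simply reproduces the coefficient
```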
For the next example, we simulate some data for a logistic regression model.
```r
set.seed(123)
y <- rbinom(300, 1, c(.3, .7))
x <- rnorm(300, 2)
y_1 <- y == 1
x[y_1] <- x[y_1] + rnorm(sum(y_1), 3)
d <- data.frame(x, y)

model2 <- glm(y ~ x, family = binomial(), data = d)
coef(model2)["x"]
```
The regression coefficient for x (on the logit-scale) is 2.641. However, for a logistic regression, this "slope" is not constant across all values of x, due to the non-linear link function. We can make this clearer by looking at the predicted probabilities:
```r
plot(ggpredict(model2, "x [all]"), ci = FALSE, add.data = TRUE)
```
As we can see, there are some differences in the case of logistic regression models compared to the linear regression model:

1. We no longer have the predicted average difference or mean in our outcome, but rather the predicted probability that our outcome is 1 for a given value of x.

2. Due to the non-linear transformation, the slope differs at different values of x; thus, the "marginal effect" or "association" (in terms of probabilities) is not constant across values of x.
While the regression coefficient in linear models is already on the response scale, and hence the (average) marginal effect equals the regression coefficient, we have different scales in logistic regression models: the coefficients shown in summary() are on the logit-scale (the scale of the linear predictor); exponentiating that coefficient (i.e. exp(coef(model2))) returns an odds ratio; and predictions are easy to interpret in terms of probabilities, as mentioned under 1) above.
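To make these three scales explicit, here is a small sketch for the model above (the values chosen for x are arbitrary and only serve illustration):

```r
# Coefficient on the logit scale (scale of the linear predictor)
coef(model2)["x"]
# Exponentiated coefficient: odds ratio
exp(coef(model2)["x"])
# Predictions on the response scale: probabilities that y = 1
predict(model2, newdata = data.frame(x = c(0, 2, 4)), type = "response")
```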
First, let's look at the average marginal effect of x in this model:
```r
margins(model2)
```
The result indicates "the contribution of each variable on the outcome scale", i.e. the "change in the predicted probability that the outcome equals 1" (see vignettes from the margins package). On average, a unit-change in x changes the predicted probability that the outcome equals 1 by 15.4%.
More generally speaking: the marginal effect represents the difference between (two) predictions for an (infinitesimal) change in x (the focal term). The average marginal effect represents the average slope of that predictor. In other words: the average marginal effect is one value per parameter (term), so it can be considered an "adjusted regression coefficient", while predicted values usually describe the average outcome at different values of x - in the latter case, you usually don't have just one coefficient that represents the overall effect of x.
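The following sketch illustrates this "average slope" idea for the logistic model from above; it is a conceptual approximation, not the actual implementation of margins(), so the reported value may differ slightly depending on step size and method:

```r
# Average marginal effect of x as the mean of observation-wise slopes
# (numerical derivative of the predicted probability, conceptual sketch)
h <- 1e-5
p0 <- predict(model2, newdata = d, type = "response")
p1 <- predict(model2, newdata = transform(d, x = x + h), type = "response")
mean((p1 - p0) / h)
```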
I personally find it less intuitive to interpret average marginal effects, in particular for non-Gaussian models, because it is harder to understand an average effect where we actually have varying effects across the range of the focal term. Instead, I prefer to look at predictions at different values of the focal term(s), which is what ggeffects returns by default:
ggpredict(model2, "x")
For x = -2, the predicted probability that y = 1, as estimated by our model, is zero. For x = 10, the probability is 100%. In essence, what ggpredict() returns are not average marginal effects, but rather the predicted values at different values of x. This makes clear that marginal effects require predictions: the marginal effect would be the difference between any two adjacent predicted values (provided these are close enough, in case of a non-linear relationship between the outcome and the focal variable).
Following these lecture notes, marginal effects are based on predictions. The main difference is how "effects" is understood. In particular in econometrics, "marginal effects" are understood as numerical derivatives of the predictions with respect to the focal term, which is what Stata's margins, dydx(varname) or R's margins::dydx() compute. Marginal effects at specific values, however - in Stata: margins, at(var1 = 5, var2 = 10) - are considered predictions.
More technically, these types of marginal effects (where non-focal terms are held constant at representative values) are calculated like this:
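Since the models above have only one predictor, the following minimal sketch uses a small hypothetical model m with a focal term x1 and a non-focal term x2 (all names and data are made up for illustration, this is not code from ggeffects itself):

```r
# Hypothetical example: predictions at representative values of the focal term,
# holding the non-focal term constant (here: at its mean)
set.seed(1)
dat <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100))
m <- lm(y ~ x1 + x2, data = dat)

grid <- data.frame(
  x1 = c(-1, 0, 1),     # representative values of the focal term
  x2 = mean(dat$x2)     # non-focal term held constant
)
predict(m, newdata = grid)
```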
In Stata language, they are also called "predictive margins", and this is what ggeffects returns. Thus, the language used throughout this package considers "marginal effects" as predictions, i.e. predicted values. Depending on the response scale, these are either predicted (mean) values, predicted probabilities, predicted (mean) counts (for count models), etc. Currently, ggeffects does not calculate average marginal effects. See the last section below for a summary of the different meanings and definitions.
Sometimes, the term estimated marginal means is used as well, because it is commonly used in software packages like SPSS; there is also a prominent R package, emmeans.
But what is the difference, for instance, between simple means and "estimated marginal" means? And why "marginal"? The idea behind marginal effects, and estimated marginal means, is that the estimated (or predicted) average outcome value is adjusted for the remaining covariates. We shall demonstrate this with another linear model.
We first simulate some fake data, where we want to see how income affects well-being. The dataset also includes a variable on health, which we will use later.
```r
set.seed(123)
wellbeing <- runif(300, 0, 100)
income <- rep(NA, 300)
health <- runif(300, 30, 80)

health[wellbeing < 50] <- health[wellbeing < 50] - rnorm(sum(wellbeing < 50), 30, sd = 10)

income[wellbeing < 25] <- sample(
  1:3, sum(wellbeing < 25), replace = TRUE, prob = c(.7, .2, .1)
)
income[wellbeing >= 25 & wellbeing < 50] <- sample(
  1:3, sum(wellbeing >= 25 & wellbeing < 50), replace = TRUE, prob = c(.5, .3, .2)
)
income[wellbeing >= 50 & wellbeing < 75] <- sample(
  1:3, sum(wellbeing >= 50 & wellbeing < 75), replace = TRUE, prob = c(.35, .35, .3)
)
income[wellbeing >= 75] <- sample(
  1:3, sum(wellbeing >= 75), replace = TRUE, prob = c(.1, .2, .7)
)

income <- factor(income)
levels(income) <- c("low", "middle", "high")

d <- data.frame(income, wellbeing, health)
```
We now fit a linear model to look at the regression coefficients:
```r
library(parameters)
model3 <- lm(wellbeing ~ income, data = d)
model_parameters(model3)
```
We can see that the average well-being is 15.5 points higher for people from the middle income group compared to those from the low income group. People with high income even have, on average, a well-being that is 33.11 points higher than that of people with low income.
We can fairly easily calculate the predicted average well-being by summing up the intercept and the coefficient for middle and high income, respectively. This is also what ggpredict() (or ggemmeans() or ggeffect()) does:
ggpredict(model3, "income")
In this example, the "marginal effects" (or estimated marginal means) equal the simple average values of well-being by the different income groups:
```r
aggregate(d$wellbeing, list(d$income), mean)
```
However, we may suspect that well-being does not only depend on income, but also on other factors, such as health status. health would then be a confounder that affects the association between income and wellbeing.
```r
model4 <- lm(wellbeing ~ income + health, data = d)
compare_parameters(model3, model4)
```
Now we see that the effect of income on well-being is less pronounced when we take health status into account. Such confounders can be adjusted for when calculating marginal effects (or estimated marginal means). Due to this adjustment, the predicted average values of wellbeing are no longer the same as the simple group means:
ggpredict(model4, "income")
This is the difference between simple "means" and "estimated marginal means": the latter are "adjusted" means, based on a model that adjusts for confounders. These predicted means are thus "marginalized" (i.e. averaged) over the levels of all covariates. This is what is meant by the sentence in the first paragraph, that ggeffects is "making predictions generated by the model when one holds the non-focal variables constant". However, there are different ways to hold non-focal terms constant, and this is where ggpredict() differs from ggemmeans() and ggeffect() (described in detail in this vignette).
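To give an idea of how such approaches can differ (a conceptual sketch, not the exact implementation of those functions): one can either hold health constant at its mean, or average the predictions over all observed health values ("marginalizing" over the covariate). For a linear model such as model4 both give the same result, but they can differ for non-linear models.

```r
# Approach 1: hold health constant at its mean
nd1 <- data.frame(
  income = factor("middle", levels = levels(d$income)),
  health = mean(d$health)
)
at_mean <- predict(model4, newdata = nd1)

# Approach 2: predict for every observed health value, then average
# ("marginalizing" over the covariate)
nd2 <- data.frame(
  income = factor("middle", levels = levels(d$income)),
  health = d$health
)
averaged <- mean(predict(model4, newdata = nd2))

c(at_mean = unname(at_mean), averaged = averaged)
```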
We can demonstrate the same aspect of "adjusted predictions" that we have seen above for the linear model for logistic regression models as well. To do so, we again generate some fake data.
```r
smoking <- data.frame(
  sex = factor(
    c("male", "female", "female", "male", "female", "female", "male",
      "female", "female", "male", "male", "female", "female"),
    levels = c("male", "female")
  ),
  smoking = factor(
    c("no", "yes", "yes", "yes", "yes", "no", "no", "yes", "yes", "no",
      "no", "no", "yes"),
    levels = c("no", "yes")
  ),
  age = c(10, 45, 50, 40, 45, 12, 14, 55, 60, 10, 14, 50, 40)
)
```
Looking at the proportions in the table, we see that many more female persons smoke compared to male persons:
```r
100 * prop.table(table(smoking$sex, smoking$smoking), margin = 1)
```
In this case, we have no "estimated" or "predicted" means or averages, but predicted probabilities. According to the table, the probability of smoking is 75% for female persons, while it is only 20% for male persons. We get the same values for the predicted probabilities if we run a logistic regression model:
```r
model5 <- glm(smoking ~ sex, family = binomial(), data = smoking)

# Looking at the odds ratio for "sex"
model_parameters(model5, exponentiate = TRUE)

# Looking at the predicted probabilities for "sex"
ggpredict(model5, "sex")
```
The reference category for sex is male, so we can estimate the average marginal effects for female persons using margins():
```r
margins(model5)
```
The interpretation is as stated above: the change in the predicted probability that the outcome equals 1 for female persons is 0.55, i.e. 55%. This is exactly the difference between the predicted probabilities for male and female persons.
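We can verify this by taking the difference between the two predicted probabilities shown above:

```r
# Difference between the predicted probabilities for male and female persons
pr <- ggpredict(model5, "sex")
diff(pr$predicted)
```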
Looking at the age distribution in the sample, we might conclude that our model produces biased estimates, and therefore biased predictions. Remember the high odds ratio of our model shown above. Now we include age as a possible confounder in our model.
```r
model6 <- glm(smoking ~ sex + age, family = binomial(), data = smoking)

# Looking at the odds ratio for "sex"
compare_parameters(model5, model6, exponentiate = TRUE)

# Looking at the predicted probabilities for "sex"
ggpredict(model6, "sex")
```
As we can see, the female persons were much older than the male persons. Smoking is also associated with age and it is less likely that people smoke when they are children or younger teenagers. Adjusting for age reveals that the probability of smoking is actually higher for male persons, not female.
ggpredict() holds non-focal terms constant at their mean value (if these are continuous) or at their reference level (for factors). Thus, effects returned by ggpredict() are actually conditional effects (i.e. these are conditioned on certain (reference) levels of factors). However, ggeffect() and ggemmeans() return marginal effects, since the effects are "marginalized" (or "averaged") over the levels of factors.
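For example, if we take age as the focal term in model6, ggpredict() holds sex at its reference level ("male"), while ggemmeans() averages the predictions over both levels of sex (this sketch assumes the emmeans package is installed, which ggemmeans() relies on):

```r
# Non-focal factor "sex" held at its reference level ("male")
ggpredict(model6, "age")

# Non-focal factor "sex" averaged over its levels
ggemmeans(model6, "age")
```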
In short:

- ggpredict(), ggemmeans() or ggeffect() help answer the question: what do I expect the outcome to be if x = 1 (or any other specific value or level)?
- Marginal effects help answer the question: what is the effect of x on y?
- Average marginal effects help answer the question: what is the average effect of x on y?

Marginal effects measure the association between a change in the predictors and a change in the outcome. It is an effect, not a prediction. It is a change, not a level. Adjusted predictions measure the average value of the outcome for specific values or levels of predictors.
My impression is that there are many different terms: conditional and marginal effects, marginal effects and (estimated) marginal means, effects and predictions, ... At this point, it is still not definitely clear which wording would be most appropriate for ggeffects. But whenever "marginal effects" are mentioned here, it is about model-based predictions at different values or levels of the focal variable(s), holding the non-focal variables constant at their mean, at their reference level, or averaged over factor levels.
For linear models, adjusted predictions and marginal effects are the same. For non-Gaussian models, strictly speaking (and as usually defined in econometrics), "marginal effects" are not the same as estimated marginal means, conditional effects, adjusted predictions or predicted values. However, I'm revising the wording in the package documentation and vignettes to avoid the term "marginal effects" as far as possible...
There is a new R package published in late 2021, marginaleffects. The related website provides definitions of the four quantities that the package can compute and is definitely worth reading.
Gelman A, Hill J, Vehtari A (2020). Regression and Other Stories. Cambridge University Press.