In easystats/estimate: Estimation of Model-Based Predictions, Contrasts and Means

library(knitr)
knitr::opts_chunk$set(
  package.startup.message = FALSE,
  dpi = 300,
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = TRUE
)
options(knitr.kable.NA = "", digits = 2, width = 800, modelbased_join_dots = FALSE)
pkgs <- c("ggplot2", "marginaleffects", "collapse", "Formula")
successfully_loaded <- suppressPackageStartupMessages(vapply(pkgs, requireNamespace, FUN.VALUE = logical(1L), quietly = TRUE))
can_evaluate <- all(successfully_loaded)

if (can_evaluate) {
  knitr::opts_chunk$set(eval = TRUE)
  junk <- suppressMessages(suppressPackageStartupMessages(vapply(pkgs, require, FUN.VALUE = logical(1L), quietly = TRUE, character.only = TRUE)))
} else {
  knitr::opts_chunk$set(eval = FALSE)
}

This vignette demonstrates a typical workflow using easystats packages, with a logistic regression model as an example. We will explore how the modelbased package can help to better understand our model and how to interpret results.

Preparing the data and fitting a model

The very first step is usually importing and preparing some data (recoding, re-shaping data and so on - the usual data wrangling tasks), which is easily done using the datawizard package. In this example, we use datawizard only for some minor recodings. The coffee_data data set is included in the modelbased package. The data set contains information on the effect of coffee consumption on alertness over time. The outcome variable is binary (alertness), and the predictor variables are coffee consumption (treatment) and time.

library(datawizard) # for data management, e.g. recodings

data(coffee_data, package = "modelbased")

# dichotomize outcome variable
coffee_data$alertness <- categorize(coffee_data$alertness, lowest = 0)
# rename variable
coffee_data <- data_rename(coffee_data, select = c(treatment = "coffee"))

# model
model <- glm(alertness ~ treatment * time, data = coffee_data, family = binomial())

Exploring the model - model coefficients

Let's start by examining the model coefficients. The package that manages everything related to model coefficients is the parameters package. We can use the model_parameters() function to extract the coefficients from the model. By setting exponentiate = TRUE, we can obtain the odds ratios for the coefficients.

library(parameters)

# coefficients
model_parameters(model, exponentiate = TRUE)

The model coefficients are difficult to interpret directly, in particular since we have an interaction effect. Instead, we should use the modelbased package to calculate adjusted predictions for the model.

Predicted probabilities - understanding the model

As we mentioned above, interpreting model results can be hard, and sometimes even misleading, if you only look at the regression coefficients. Instead, it is useful to estimate model-based means or probabilities for the outcome. Ab absolutely easy way to make interpretation easier is to use the modelbased package. You just need to provide your predictors of interest, so called focal terms.

Since we are interested in the interaction effect of coffee consumption (treatment) on alertness depending on different times of the day, we simply specify these two variables as focal terms in the estimate_means() function. This function calculates predictions on the response scale of the regression model. For logistic regression models, predicted probabilities are calculated. These refer to the adjusted probabilities of the outcome (higher alertness) depending on the predictor variables (treatment and time).

library(modelbased)

# predicted probabilities
predictions <- estimate_means(model, c("time", "treatment"))
predictions

We now see that high alertness was most likely for the coffee group in the afternoon time (about 75% probability of high alertness for the afternoon-coffee group).

We can also visualize these results, using the plot() method. In short, this will give us a visual interpretation of the model.

# plot predicted probabilities
plot(predictions)

We can also see that the predicted probabilities of alertness are higher for participants who consumed coffee compared to those who did not, but only in the morning and in the afternoon. Furthermore, we see differences between the coffee and the control group at each time point - but are these differences statistically significant?

Pairwise comparisons - testing the differences

To check this, we finally use the estimate_contrasts() function to perform pairwise comparisons of the predicted probabilities. This function needs to know the variables that should be compared, or contrasted. In a first step, we want to compare all levels of the variables involved in our interaction term (our focal terms from above).

# pairwise comparisons - quite long table
estimate_contrasts(model, c("time", "treatment"))

In the above output, we see all possible pairwise comparisons of the predicted probabilities. The table is quite long, but we can also group the comparisons, e.g. by the variable time.

# group comparisons by "time"
estimate_contrasts(model, "treatment", by = "time")

The output shows that the differences between the coffee and the control group are statistically significant only in the noon time.

easystats/estimate documentation built on April 5, 2025, 1:36 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

easystats/estimate
Estimation of Model-Based Predictions, Contrasts and Means

In easystats/estimate: Estimation of Model-Based Predictions, Contrasts and Means

Preparing the data and fitting a model

Exploring the model - model coefficients

Predicted probabilities - understanding the model

Pairwise comparisons - testing the differences

R Package Documentation

Browse R Packages

We want your feedback!

easystats/estimate Estimation of Model-Based Predictions, Contrasts and Means

In easystats/estimate: Estimation of Model-Based Predictions, Contrasts and Means

Preparing the data and fitting a model

Exploring the model - model coefficients

Predicted probabilities - understanding the model

Pairwise comparisons - testing the differences

R Package Documentation

Browse R Packages

We want your feedback!

easystats/estimate
Estimation of Model-Based Predictions, Contrasts and Means