get_emmeans: Consistent API for 'emmeans' and 'marginaleffects'
In modelbased: Estimation of Model-Based Predictions, Contrasts and Means

get_emcontrasts

R Documentation

Description

These functions are convenient wrappers around the emmeans and marginaleffects packages. They are mainly intended for developers who want a unified API for getting model-based estimates; regular users should use the estimate_*() set of functions instead.

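The get_emmeans(), get_emcontrasts() and get_emtrends() functions are wrappers around emmeans::emmeans() and emmeans::emtrends(). The get_marginalmeans(), get_marginalcontrasts() and get_marginaltrends() functions are wrappers around marginaleffects::avg_predictions() and marginaleffects::avg_slopes().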

Usage

get_emcontrasts(
  model,
  contrast = NULL,
  by = NULL,
  predict = NULL,
  comparison = "pairwise",
  transform = NULL,
  keep_iterations = FALSE,
  verbose = TRUE,
  ...
)

get_emmeans(
  model,
  by = "auto",
  predict = NULL,
  transform = NULL,
  keep_iterations = FALSE,
  verbose = TRUE,
  ...
)

get_emtrends(
  model,
  trend = NULL,
  by = NULL,
  keep_iterations = FALSE,
  verbose = TRUE,
  ...
)

get_marginalcontrasts(
  model,
  contrast = NULL,
  by = NULL,
  predict = NULL,
  ci = 0.95,
  comparison = "pairwise",
  estimate = getOption("modelbased_estimate", "typical"),
  p_adjust = "none",
  transform = NULL,
  keep_iterations = FALSE,
  verbose = TRUE,
  ...
)

get_marginalmeans(
  model,
  by = "auto",
  predict = NULL,
  ci = 0.95,
  estimate = getOption("modelbased_estimate", "typical"),
  transform = NULL,
  keep_iterations = FALSE,
  verbose = TRUE,
  ...
)

get_marginaltrends(
  model,
  trend = NULL,
  by = NULL,
  ci = 0.95,
  p_adjust = "none",
  transform = NULL,
  keep_iterations = FALSE,
  verbose = TRUE,
  ...
)

Arguments

`model`	A statistical model.
`contrast`	A character vector indicating the name of the variable(s) for which to compute the contrasts, optionally including representative values or levels at which contrasts are evaluated (e.g., `contrast="x=c('a','b')"`).
`by`	The (focal) predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). `by` can be a character (vector) naming the focal predictors, optionally including representative values or levels at which focal predictors are evaluated (e.g., `by="x=c(1,2)"`). When `estimate` is not `"average"`, the `by` argument is used to create a "reference grid" or "data grid" with representative values for the focal predictors. In this case, `by` can also be a list of named elements. See details in `insight::get_datagrid()` to learn more about how to create data grids for predictors of interest.
`predict`	Passed to the `type` argument in `emmeans::emmeans()` (when `backend = "emmeans"`) or in `marginaleffects::avg_predictions()` (when `backend = "marginaleffects"`). For emmeans, see also this vignette. Valid options for `predict` are: `backend = "marginaleffects"`: `predict` can be `"response"`, `"link"`, `"inverse_link"` or any valid `type` option supported by the `predict()` method of the model's class (e.g., for zero-inflation models from package glmmTMB, you can choose `predict = "zprob"` or `predict = "conditional"` etc., see glmmTMB::predict.glmmTMB). By default, when `predict = NULL`, the most appropriate transformation is selected, which usually returns predictions or contrasts on the response scale. `"inverse_link"` is a special option, comparable to marginaleffects' `invlink(link)` option. It calculates predictions on the link scale and then back-transforms them to the response scale. `backend = "emmeans"`: `predict` can be `"response"`, `"link"`, `"mu"`, `"unlink"`, or `"log"`. If `predict = NULL` (default), the most appropriate transformation is selected (which usually is `"response"`). `"link"` leaves the values on the scale of the linear predictor. `"response"` (or `NULL`) transforms them to the scale of the response variable. Thus, for a logistic model, `"link"` gives estimates expressed in log-odds (probabilities on the logit scale) and `"response"` gives estimates expressed as probabilities. To predict distributional parameters (called "dpar" in other packages), for instance when using complex formulae in `brms` models, the `predict` argument can take the value of the parameter you want to estimate, for instance `"sigma"`, `"kappa"`, etc. `"response"` and `"inverse_link"` both return predictions on the response scale, but they differ in the order of operations. `"response"` first calculates predictions on the response scale for each observation and then aggregates them by the groups or levels defined in `by`. `"inverse_link"` first calculates predictions on the link scale for each observation, then aggregates them by the groups or levels defined in `by`, and finally back-transforms the aggregated predictions to the response scale. Both approaches have advantages and disadvantages. `"response"` usually produces less biased predictions, but confidence intervals might fall outside reasonable bounds (e.g., they can be negative for count data). The `"inverse_link"` approach is more robust in terms of confidence intervals, but might produce biased predictions. For mixed models in particular, using `"response"` is recommended, because averaging across random effects groups is more accurate.
`comparison`	Specify the type of contrasts or tests that should be carried out. When `backend = "emmeans"`, can be one of `"pairwise"`, `"poly"`, `"consec"`, `"eff"`, `"del.eff"`, `"mean_chg"`, `"trt.vs.ctrl"`, `"dunnett"`, `"wtcon"` and some more. See also the `method` argument in emmeans::contrast and `?emmeans::emmc-functions`. For `backend = "marginaleffects"`, can be a numeric value, vector, or matrix, a string equation specifying the hypothesis to test, a string naming the comparison method, a formula, or a function. Strings, string equations and formulas are probably the most common options and are described below. For other options and detailed descriptions of those options, see also marginaleffects::comparisons and this website. String: One of `"pairwise"`, `"reference"`, `"sequential"`, `"meandev"`, `"meanotherdev"`, `"poly"`, `"helmert"`, or `"trt_vs_ctrl"`. String equation: To identify parameters from the output, either specify the term name, or `"b1"`, `"b2"` etc. to indicate rows, e.g.: `"hp = drat"`, `"b1 = b2"`, or `"b1 + b2 + b3 = 0"`. Formula: A formula like `comparison ~ pairs | group`, where the left-hand side indicates the type of comparison (`difference` or `ratio`), and the right-hand side determines the pairs of estimates to compare (`reference`, `sequential`, `meandev`, etc., see string options). Optionally, comparisons can be carried out within subsets by indicating the grouping variable after a vertical bar (`|`).
`transform`	A function applied to predictions and confidence intervals to (back-) transform results, which can be useful in case the regression model has a transformed response variable (e.g., `lm(log(y) ~ x)`). For Bayesian models, this function is applied to individual draws from the posterior distribution, before computing summaries. Can also be `TRUE`, in which case `insight::get_transformation()` is called to determine the appropriate transformation-function. Note that no standard errors are returned when transformations are applied.
`keep_iterations`	If `TRUE`, will keep all iterations (draws) of bootstrapped or Bayesian models. They will be added as additional columns named `iter_1`, `iter_2`, and so on. If `keep_iterations` is a positive number, only as many columns as indicated in `keep_iterations` will be added to the output. You can reshape them to a long format by running `bayestestR::reshape_iterations()`.
`verbose`	Use `FALSE` to silence messages and warnings.
`...`	Other arguments passed, for instance, to `insight::get_datagrid()`, to functions from the emmeans or marginaleffects package, or to process Bayesian models via `bayestestR::describe_posterior()`. Examples: `insight::get_datagrid()`: Arguments such as `length`, `digits` or `range` can be used to control the (number of) representative values. marginaleffects: Internally used functions are `avg_predictions()` for means and contrasts, and `avg_slopes()` for slopes. Therefore, arguments such as `vcov`, `equivalence`, `df`, `slope` or even `newdata` can be passed to those functions. A `weights` argument is passed to the `wts` argument in `avg_predictions()` or `avg_slopes()`; however, weights can only be applied when `estimate` is `"average"` or `"population"` (i.e. for those marginalization options that do not use data grids). Other arguments, such as `re.form` or `allow.new.levels`, may be passed to `predict()` (which is internally used by marginaleffects) if supported by that model class. emmeans: Internally used functions are `emmeans()` and `emtrends()`. Additional arguments can be passed to these functions. Bayesian models: For Bayesian models, parameters are cleaned using `describe_posterior()`; thus, arguments such as `centrality`, `rope_range`, or `test` are passed to that function.
`trend`	A character indicating the name of the variable for which to compute the slopes.
`ci`	Confidence Interval (CI) level. Defaults to `0.95` (95%).
`estimate`	The `estimate` argument determines how predictions are averaged ("marginalized") over variables not specified in `by` or `contrast` (non-focal predictors). It controls whether predictions represent a "typical" individual, an "average" individual from the sample, or an "average" individual from a broader population. `"typical"` (Default): Calculates predictions for a balanced data grid representing all combinations of focal predictor levels (specified in `by`). For non-focal numeric predictors, it uses the mean; for non-focal categorical predictors, it marginalizes (averages) over the levels. This represents a "typical" observation based on the data grid and is useful for comparing groups. It answers: "What would the average outcome be for a 'typical' observation?". This is the default approach when estimating marginal means using the emmeans package. `"average"`: Calculates predictions for each observation in the sample and then averages these predictions within each group defined by the focal predictors. This reflects the sample's actual distribution of non-focal predictors, not a balanced grid. It answers: "What is the predicted value for an average observation in my data?" `"population"`: "Clones" each observation, creating copies with all possible combinations of focal predictor levels. It then averages the predictions across these "counterfactual" observations (non-observed permutations) within each group. This extrapolates to a hypothetical broader population, considering "what if" scenarios. It answers: "What is the predicted response for the 'average' observation in a broader possible target population?" This approach entails more assumptions about the likelihood of different combinations, but may generalize better. This is also the option that should be used for G-computation (Chatton and Rohrer 2024). You can set a default option for the `estimate` argument via `options()`, e.g. `options(modelbased_estimate = "average")`.
`p_adjust`	The p-value adjustment method for frequentist multiple comparisons. For `estimate_slopes()`, multiple comparisons only occur for Johnson-Neyman intervals, i.e. in case of interactions with two numeric predictors (one specified in `trend`, one in `by`). In this case, the `"esarey"` option is recommended, but `p_adjust` can also be one of `"none"` (default), `"hochberg"`, `"hommel"`, `"bonferroni"`, `"BH"`, `"BY"`, `"fdr"`, `"tukey"`, `"sidak"`, or `"holm"`.
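
The averaging orders described for `predict` (`"response"` versus `"inverse_link"`) and for `estimate` (`"typical"`, `"average"`, `"population"`) can be sketched in base R. This is only an illustration of the ideas under simple assumptions, not the code that modelbased, emmeans or marginaleffects run internally; the simulated data frame `d` and all object names are hypothetical.

```r
# Base-R sketch (an assumption-laden illustration, not modelbased internals).

# --- `predict`: "response" vs. "inverse_link" for a Poisson model ----------
set.seed(123)
d <- data.frame(x = rnorm(200), g = gl(2, 100, labels = c("a", "b")))
d$y <- rpois(200, lambda = exp(0.3 + 0.8 * d$x + 0.5 * (d$g == "b")))
m <- glm(y ~ x + g, family = poisson(), data = d)

# "response": predict on the response scale per observation, then average by group
p_response <- sapply(split(predict(m, type = "response"), d$g), mean)

# "inverse_link": average on the link scale by group, then back-transform
p_inverse_link <- family(m)$linkinv(
  sapply(split(predict(m, type = "link"), d$g), mean)
)

# --- `estimate`: "typical" vs. "average" vs. "population" for a linear model
model <- lm(Sepal.Length ~ Species * Petal.Width, data = iris)
lv <- levels(iris$Species)

# "typical": balanced grid, non-focal numeric predictor held at its mean
grid <- data.frame(
  Species = factor(lv, levels = lv),
  Petal.Width = mean(iris$Petal.Width)
)
est_typical <- setNames(predict(model, newdata = grid), lv)

# "average": predict for the observed rows, then average within observed groups
est_average <- sapply(split(predict(model), iris$Species), mean)

# "population": clone every row once per focal level, then average per level
est_population <- sapply(lv, function(s) {
  clone <- iris
  clone$Species <- factor(rep(s, nrow(clone)), levels = lv)
  mean(predict(model, newdata = clone))
})

rbind(typical = est_typical, average = est_average, population = est_population)
```

With a log link, averaging on the response scale can never give a smaller value than back-transforming the averaged linear predictor (Jensen's inequality), which is why the two `predict` options differ. In this particular linear model, `"typical"` and `"population"` coincide and `"average"` reproduces the observed group means; with a non-linear link function the three strategies generally give different results.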

Examples


# Basic usage
model <- lm(Sepal.Width ~ Species, data = iris)
get_emcontrasts(model)

## Not run: 
# Dealing with interactions
model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris)
# By default: selects first factor
get_emcontrasts(model)
# Or both
get_emcontrasts(model, contrast = c("Species", "Petal.Width"), length = 2)
# Or with custom specifications
get_emcontrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)"))
# Or modulate it
get_emcontrasts(model, by = "Petal.Width", length = 4)

## End(Not run)


model <- lm(Sepal.Length ~ Species + Petal.Width, data = iris)

# By default, 'by' is set to "Species"
get_emmeans(model)

## Not run: 
# Overall mean (close to 'mean(iris$Sepal.Length)')
get_emmeans(model, by = NULL)

# One can estimate marginal means at several values of a 'modulate' variable
get_emmeans(model, by = "Petal.Width", length = 3)

# Interactions
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)

get_emmeans(model)
get_emmeans(model, by = c("Species", "Petal.Length"), length = 2)
get_emmeans(model, by = c("Species", "Petal.Length = c(1, 3, 5)"), length = 2)

## End(Not run)


## Not run: 
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)

get_emtrends(model)
get_emtrends(model, by = "Species")
get_emtrends(model, by = "Petal.Length")
get_emtrends(model, by = c("Species", "Petal.Length"))

## End(Not run)

model <- lm(Petal.Length ~ poly(Sepal.Width, 4), data = iris)
get_emtrends(model)
get_emtrends(model, by = "Sepal.Width")


model <- lm(Sepal.Length ~ Species + Petal.Width, data = iris)

# By default, 'by' is set to "Species"
get_marginalmeans(model)

# Overall mean (close to 'mean(iris$Sepal.Length)')
get_marginalmeans(model, by = NULL)

## Not run: 
# One can estimate marginal means at several values of a 'modulate' variable
get_marginalmeans(model, by = "Petal.Width", length = 3)

# Interactions
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)

get_marginalmeans(model)
get_marginalmeans(model, by = c("Species", "Petal.Length"), length = 2)
get_marginalmeans(model, by = c("Species", "Petal.Length = c(1, 3, 5)"), length = 2)

## End(Not run)


model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)

get_marginaltrends(model, trend = "Petal.Length", by = "Species")
get_marginaltrends(model, trend = "Petal.Length", by = "Petal.Length")
get_marginaltrends(model, trend = "Petal.Length", by = c("Species", "Petal.Length"))
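
The examples above do not include a call to get_marginalcontrasts(). A minimal sketch follows, assuming the modelbased and marginaleffects packages are installed; it uses only arguments listed in the Usage section, and the object name `contr` is hypothetical.

```r
# Assumes the modelbased and marginaleffects packages are installed
library(modelbased)

model <- lm(Sepal.Width ~ Species, data = iris)

# Pairwise contrasts between the Species levels, with Holm-adjusted p-values
contr <- get_marginalcontrasts(model, contrast = "Species", p_adjust = "holm")
contr
```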

modelbased documentation built on April 12, 2025, 2:22 a.m.