# emmeans: Estimated marginal means (Least-squares means)

## Description

Compute estimated marginal means (EMMs) for specified factors or factor combinations in a linear model; and optionally, comparisons or contrasts among them. EMMs are also known as least-squares means.

## Usage

```
emmeans(object, specs, by = NULL,
  fac.reduce = function(coefs) apply(coefs, 2, mean), contr,
  options = get_emm_option("emmeans"), weights, offset, trend, ..., tran)
```

## Arguments

`object`

An object of class `emmGrid`; or a fitted model object that is supported, such as the result of a call to `lm` or `lmer`. Many fitted-model objects are supported; see `vignette("models", "emmeans")` for details.

`specs`

A `character` vector specifying the names of the predictors over which EMMs are desired. `specs` may also be a `formula` or a `list` (optionally named) of valid `spec`s. Use of formulas is described in the Overview section below.

`by`

A character vector specifying the names of predictors to condition on.

`fac.reduce`

A function that combines the rows of a matrix into a single vector. This implements the “marginal averaging” aspect of EMMs. The default is the mean of the rows. Typically, if it is overridden, it would be some kind of weighted mean of the rows. If `fac.reduce` is nonlinear, bizarre results are likely, and EMMs will not be interpretable. NOTE: If the `weights` argument is non-missing, `fac.reduce` is ignored.

`contr`

A character value or `list` specifying contrasts to be added. See `contrast`. NOTE: `contr` is ignored when `specs` is a formula.

`options`

If non-`NULL`, a named `list` of arguments to pass to `update.emmGrid`, just after the object is constructed. (Options may also be included in `...`; see the “Options and `...`” section below.)

`weights`

Character value, numeric vector, or numeric matrix specifying weights to use in averaging predictions. See the “Weights” section below.

`offset`

Numeric vector or scalar. If specified, this adds an offset to the predictions, or overrides any offset in the model or its reference grid. If a vector of length differing from the number of rows in the result, it is subsetted or cyclically recycled.

`trend`

This is now deprecated. Use `emtrends` instead.

`...`

When `object` is not already an `"emmGrid"` object, these arguments are passed to `ref_grid`. Common examples are `at`, `cov.reduce`, `data`, `type`, `transform`, `df`, `nesting`, and `vcov.`. Model-type-specific options (see `vignette("models", "emmeans")`), commonly `mode`, may be used here as well. In addition, if the model formula contains references to variables that are not predictors, you must provide a `params` argument with a list of their names. These arguments may also be used in lieu of `options`. See the “Options and `...`” section below.

`tran`

Placeholder to prevent it from being included in `...`. If non-missing, it is added to `options`. See the “Options and `...`” section below.

## Details

Users should also consult the documentation for `ref_grid`, because many important options for EMMs are implemented there, via the `...` argument.

## Value

When `specs` is a `character` vector or one-sided formula, an object of class `"emmGrid"`. A number of methods are provided for further analysis, including `summary.emmGrid`, `confint.emmGrid`, `test.emmGrid`, `contrast.emmGrid`, and `pairs.emmGrid`. When `specs` is a `list` or a `formula` having a left-hand side, the return value is an `emm_list` object, which is simply a `list` of `emmGrid` objects.

## Overview

Estimated marginal means or EMMs (sometimes called least-squares means) are predictions from a linear model over a reference grid; or marginal averages thereof. The `ref_grid` function identifies/creates the reference grid upon which `emmeans` is based.
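The idea of "predictions over a reference grid, then marginal averages" can be sketched in base R without calling `emmeans` at all. This is only an illustration under stated assumptions, not the package's implementation: the `grid` data frame built with `expand.grid` is a hypothetical stand-in for a reference grid, and the averaging uses equal weights.

```r
# Base-R sketch: EMMs as equally weighted averages of model predictions.
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)

# Hypothetical stand-in for a reference grid: one row per factor combination.
grid <- expand.grid(wool = levels(warpbreaks$wool),
                    tension = levels(warpbreaks$tension))
grid$pred <- predict(warp.lm, newdata = grid)

# Average the predictions over wool within each level of tension.
emm_tension <- tapply(grid$pred, grid$tension, mean)
print(emm_tension)
```

Because this model includes the `wool:tension` interaction, the predictions are the cell means, so each value is the unweighted mean of two cell means.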

For those who prefer the terms “least-squares means” or “predicted marginal means”, functions `lsmeans` and `pmmeans` are provided as wrappers. See `wrappers`.

If `specs` is a `formula`, it should be of the form `~ specs`, `~ specs | by`, `contr ~ specs`, or `contr ~ specs | by`. The formula is parsed and the variables therein are used as the arguments `specs`, `by`, and `contr` as indicated. The left-hand side is optional, but if specified it should be the name of a contrast family (e.g., `pairwise`). Operators like `*` or `:` are needed in the formula to delineate names, but otherwise are ignored.

In the special case where the mean (or weighted mean) of all the predictions is desired, specify `specs` as `~ 1` or `"1"`.

A number of standard contrast families are provided. They can be identified as functions having names ending in `.emmc` – see the documentation for `emmc-functions` for details – including how to write your own `.emmc` function for custom contrasts.
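The formula forms described above can be tried on the `warpbreaks` data used elsewhere on this page. The following is a small sketch, assuming the emmeans package is installed; the object names `emm`, `both`, and `overall` are chosen here for illustration.

```r
library(emmeans)

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)

# One-sided formula (~ specs | by): returns a single emmGrid object.
emm <- emmeans(warp.lm, ~ tension | wool)

# Two-sided formula (contr ~ specs | by): the left-hand side names a
# contrast family, so the result is an emm_list of means and contrasts.
both <- emmeans(warp.lm, pairwise ~ tension | wool)
names(both)

# Special case: the mean of all the predictions.
overall <- emmeans(warp.lm, ~ 1)
```

Individual pieces of the `emm_list` can then be extracted by name, for example `both$contrasts`.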

## Weights

If `weights` is a vector, its length must equal the number of predictions to be averaged to obtain each EMM. If a matrix, each row of the matrix is used in turn, wrapping back to the first row as needed. When in doubt about what is being averaged (or how many), first call `emmeans` with `weights = "show.levels"`.

If `weights` is a string, it should partially match one of the following:

`"equal"`

Use an equally weighted average.

`"proportional"`

Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.

`"outer"`

Weight in proportion to each individual factor's marginal frequencies. Thus, the weights for a combination of factors are the outer product of the one-factor margins.

`"cells"`

Weight according to the frequencies of the cells being averaged.

`"flat"`

Give equal weight to all cells with data, and ignore empty cells.

`"show.levels"`

This is a convenience feature for understanding what is being averaged over. Instead of a table of EMMs, this causes the function to return a table showing the levels that are averaged over, in the order that they appear.

Outer weights are like the 'expected' counts in a chi-square test of independence, and will yield the same results as those obtained by proportional averaging with one factor at a time. All except `"cells"` use the same set of weights for each mean. In a model where the predicted values are the cell means, cell weights will yield the raw averages of the data for the factors involved. Using `"flat"` is similar to `"cells"`, except nonempty cells are weighted equally and empty cells are ignored.
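The relationship between proportional weights, outer weights, and chi-square expected counts can be checked numerically in base R. This sketch illustrates the definitions only; it does not show how `emmeans` computes weights internally, and the `counts` table is hypothetical data invented for the example.

```r
# Hypothetical cell counts for two factors A and B (not from any real data set).
counts <- matrix(c(10, 5, 2,
                    4, 8, 6),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(A = c("a1", "a2"), B = c("b1", "b2", "b3")))
n <- sum(counts)

# "proportional": weights follow the observed frequencies of the combinations.
w_prop <- counts / n

# "outer": outer product of the one-factor marginal proportions.
w_outer <- outer(rowSums(counts) / n, colSums(counts) / n)

# Rescaled to counts, outer weights match the chi-square 'expected' table.
expected <- suppressWarnings(chisq.test(counts)$expected)
all.equal(w_outer * n, expected, check.attributes = FALSE)
```

Both weight tables sum to 1; they differ whenever the factors are not independent in the observed counts.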

## Offsets

Unlike in `ref_grid`, an offset need not be scalar. If not enough values are supplied, they are cyclically recycled. For a vector of offsets, it is important to understand that the ordering of results goes with the first name in `specs` varying fastest. If there are any `by` factors, those vary slower than all the primary ones, but the first `by` variable varies the fastest within that hierarchy. See the examples.
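The ordering and recycling rules can be mimicked in base R. This is a sketch that assumes `expand.grid` (whose first argument varies fastest) reproduces the row order described above; the data frames `grid` and `grid2` are illustrative stand-ins, not objects returned by `emmeans`.

```r
# Assumed row order for specs = ~ tension | wool: tension (first name in
# specs) varies fastest, wool (the by factor) varies slowest.
grid <- expand.grid(tension = c("L", "M", "H"), wool = c("A", "B"))

# Three offsets for six rows: recycled cyclically.
grid$offset <- rep_len(c(17, 23, 47), nrow(grid))
grid

# Same offsets with ~ wool | tension: wool now varies fastest, so the
# offsets no longer line up with the levels of tension.
grid2 <- expand.grid(wool = c("A", "B"), tension = c("L", "M", "H"))
grid2$offset <- rep_len(c(17, 23, 47), nrow(grid2))
grid2
```

In the first layout each level of tension receives the same offset for both wools; in the second, a single tension level receives two different offsets.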

## Options and `...`

Arguments that could go in `options` may instead be included in `...`, typically, arguments such as `type`, `infer`, etc. that in essence are passed to `summary.emmGrid`. Arguments in both places are overridden by the ones in `...`.

There is a danger that `...` arguments could partially match those used by both `ref_grid` and `update.emmGrid`, creating a conflict. If this occurs, it can usually be resolved by providing complete (or at least longer) argument names; or by isolating non-`ref_grid` arguments in `options`; or by calling `ref_grid` separately and passing the result as `object`. See a not-run example below.

Also, when `specs` is a two-sided formula, or `contr` is specified, there is potential confusion concerning which `...` arguments apply to the means, and which to the contrasts. When such confusion is possible, we suggest doing things separately (a call to `emmeans` with no contrasts, followed by a call to `contrast`). We do treat `adjust` as a special case: it is applied to the `emmeans` results only if there are no contrasts specified; otherwise it is passed to `contrast`.
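The "do things separately" suggestion might look like the following sketch, which assumes the emmeans package is installed. The object names `warp.emm` and `warp.con` are illustrative; the point is that `adjust` is given only to the `contrast` call, so it cannot be mistaken for an option on the means.

```r
library(emmeans)

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)

# Step 1: estimated marginal means only, with no contrasts requested.
warp.emm <- emmeans(warp.lm, ~ tension | wool)

# Step 2: contrasts computed from the means; 'adjust' applies only here.
warp.con <- contrast(warp.emm, method = "pairwise", adjust = "sidak")

summary(warp.con)
```

With three tension levels and two wools, this yields three pairwise comparisons within each wool.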

## See Also

`ref_grid`, `contrast`, `vignette("models", "emmeans")`

## Examples

```
library(emmeans)

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
emmeans(warp.lm, ~ wool | tension)
# or equivalently
emmeans(warp.lm, "wool", by = "tension")

# 'adjust' argument ignored in emmeans, passed to contrast part...
emmeans(warp.lm, poly ~ tension | wool, adjust = "sidak")

## Not run:
# 'adjust' argument NOT ignored ...
emmeans(warp.lm, ~ tension | wool, adjust = "sidak")

## End(Not run)

## Not run:
### Offsets: Consider a silly example:
emmeans(warp.lm, ~ tension | wool, offset = c(17, 23, 47)) @ grid
# note that offsets are recycled so that each level of tension receives
# the same offset for each wool.
# But using the same offsets with ~ wool | tension will probably not
# be what you want because the ordering of combinations is different.

### Conflicting arguments...
# This will error because 'tran' is passed to both ref_grid and update
emmeans(some.model, "treatment", tran = "log", type = "response")

# Use this if the response was a variable that is the log of some other variable
# (Keep 'tran' from being passed to ref_grid)
emmeans(some.model, "treatment", options = list(tran = "log"), type = "response")

# This will re-grid the result as if the response had been log-transformed
# ('transform' is passed only to ref_grid, not to update)
emmeans(some.model, "treatment", transform = "log", type = "response")

## End(Not run)
```