View source: R/stat-multcomp.R
stat_multcomp | R Documentation |
stat_multcomp fits a linear model, by default with stats::lm() but alternatively with other model fit functions. The fitted model is passed to function glht() from package 'multcomp' to fit Tukey or Dunnett contrasts, and labels are generated based on adjusted P-values.
stat_multcomp(
mapping = NULL,
data = NULL,
geom = NULL,
position = "identity",
...,
formula = NULL,
method = "lm",
method.args = list(),
contrast.type = "Tukey",
adjusted.type = "single-step",
small.p = FALSE,
p.digits = 3,
label.type = "bars",
fm.cutoff.p.value = 1,
mc.cutoff.p.value = 1,
mc.critical.p.value = 0.05,
label.y = NULL,
vstep = NULL,
output.type = NULL,
na.rm = FALSE,
orientation = NA,
parse = NULL,
show.legend = FALSE,
inherit.aes = TRUE
)
mapping |
The aesthetic mapping, usually constructed with
|
data |
A layer specific dataset, only needed if you want to override the plot defaults. |
geom |
The geometric object to use to display the data. |
position |
The position adjustment to use for overlapping points on this layer. |
... |
other arguments passed on to |
formula |
a formula object. Using aesthetic names |
method |
function or character If character, "lm" (or its equivalent
"aov"), "rlm" or the name of a model fit function are accepted, possibly
followed by the fit function's |
method.args |
named list with additional arguments. |
contrast.type |
character One of "Tukey" or "Dunnet". |
adjusted.type |
character As the argument for parameter |
small.p |
logical If TRUE, use lower case p instead of capital P as the symbol for P-value in labels. |
p.digits |
integer Number of digits after the decimal point to
use for |
label.type |
character One of "bars", "letters" or "LETTERS", selects
how the results of the multiple comparisons are displayed. Only "bars" can
be used together with |
fm.cutoff.p.value |
numeric [0..1] The P-value for the main effect of
factor |
mc.cutoff.p.value |
numeric [0..1] The P-value for the individual contrasts above which no labelled bars are generated. Default is 1, labelling all pairwise contrasts tested. |
mc.critical.p.value |
numeric The critical P-value used for tests when the results are encoded as letters. |
label.y |
numeric vector Values in native data units or if
|
vstep |
numeric in npc units, the horizontal displacement step-size
used between labels for different contrasts when |
output.type |
character One of "expression", "LaTeX", "text", "markdown" or "numeric". |
na.rm |
a logical indicating whether NA values should be stripped before the computation proceeds. |
orientation |
character Either "x" or "y" controlling the default for
|
parse |
logical Passed to the geom. If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
This statistic can be used to automatically annotate a plot with P-values for multiple comparison tests, based on Tukey contrasts (all pairwise) or Dunnett contrasts (other levels against the first one). See Meier (2022, Chapter 3) for an accessible explanation of multiple comparisons and contrasts with package 'multcomp', of which stat_multcomp() is mostly a wrapper.
The explanatory variable mapped to the x aesthetic must be a factor as this creates the required grouping. Currently, arbitrary contrasts are not supported, mainly because they would be difficult to convert into plot annotations.
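As a rough sketch of what is being wrapped, the same kind of comparison can be run directly with 'multcomp'. The object names mpg.f, fm and mc are illustrative assumptions, and the calls made internally by stat_multcomp() may differ from these.

```r
library(multcomp)

# mpg ships with 'ggplot2'; cyl is numeric, so it is converted to a factor
mpg.f <- as.data.frame(ggplot2::mpg)
mpg.f$cyl.f <- factor(mpg.f$cyl)

# fit the linear model ("lm" is the default method of stat_multcomp())
fm <- lm(hwy ~ cyl.f, data = mpg.f)

# all pairwise (Tukey) contrasts among the levels of cyl.f
mc <- glht(fm, linfct = mcp(cyl.f = "Tukey"))

# adjusted P-values, here with the "single-step" adjustment
summary(mc, test = adjusted(type = "single-step"))
```

Passing "Dunnett" instead of "Tukey" to mcp() compares each level against the first one.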
Two ways of displaying the outcomes are implemented, bars or letters, selected by passing "bars", "letters" or "LETTERS" as argument to parameter label.type. "letters" and "LETTERS" can be used only with Tukey contrasts, as otherwise the encoding is ambiguous. As too many bars clutter a plot, the maximum number of factor levels supported for "bars" together with Tukey contrasts is five, while together with Dunnett contrasts the number of levels is unlimited.
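The number of pairwise contrasts grows quickly with the number of factor levels, which is why letters can be the more compact encoding. The sketch below counts the Tukey contrasts for five levels and derives a compact letter display with cld() from 'multcomp'. It is only an illustration: whether stat_multcomp() computes its letters in this way is an assumption, as are the object names.

```r
library(multcomp)

# number of pairwise (Tukey) contrasts among k factor levels
k <- 5
choose(k, 2)

mpg.f <- as.data.frame(ggplot2::mpg)
mpg.f$cyl.f <- factor(mpg.f$cyl)
fm <- lm(hwy ~ cyl.f, data = mpg.f)
mc <- glht(fm, linfct = mcp(cyl.f = "Tukey"))

# levels sharing a letter do not differ significantly at the given level
letters.cld <- cld(mc, level = 0.05)
letters.cld
```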
stat_multcomp() by default generates character labels ready to be parsed as R expressions but LaTeX (use TikZ device), markdown (use package 'ggtext') and plain text are also supported, as well as numeric values for user-generated text labels. The value of parse is set automatically based on output.type, but if you assemble labels that need parsing from numeric output, the default needs to be overridden. This statistic only generates annotation labels and segments connecting the compared factor levels, or letter labels that discriminate significantly different groups.
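A minimal sketch of selecting a non-default output type, assuming that the package providing stat_multcomp() (taken here to be 'ggpmisc') is installed. The name p.text is illustrative, and parse is left at its default so that it is set automatically from output.type.

```r
library(ggplot2)
library(ggpmisc) # assumed to be the package exporting stat_multcomp()

p.text <- ggplot(mpg, aes(factor(cyl), hwy)) +
  geom_boxplot(width = 0.33) +
  stat_multcomp(output.type = "text") # plain-text labels, not parsed
p.text
```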
A data frame with one row per comparison for label.type = "bars", or a data frame with one row per factor x level for label.type = "letters" and for label.type = "LETTERS". Variables (= columns) as described under Computed variables.
stat_multcomp() understands x and y, to be referenced in the formula, and weight, passed as argument to parameter weights. A factor must be mapped to x and numeric variables to y, and, if used, to weight. In addition, the aesthetics understood by the geom ("label_pairwise" is the default for label.type = "bars", "text" is the default for label.type = "letters" and for label.type = "LETTERS") are understood and grouping respected.
If output.type = "numeric" and label.type = "bars" the returned tibble contains the columns listed below. In all cases, if the model fit function used does not return a value, the label is set to character(0L) and the numeric value to NA.
x position, numeric.
y position, numeric.
Delta estimate from pairwise contrasts, numeric.
Contrasts as two levels' ordinal "numbers" separated by a dash, character.
t-statistic estimates for the pairwise contrasts, numeric.
P-value for the pairwise contrasts.
Set according to the method used.
Most derived class of the fitted model object.
Formula extracted from the fitted model object if available, or the formula argument.
Formula extracted from the fitted model object if available, or the formula argument, formatted as character.
The method used to adjust the P-values.
The type of contrast used for multiple comparisons.
The total number of observations or rows in data.
text label, always included, but possibly NA.
If output.type is not "numeric" the returned data frame includes in addition the following labels:
P-value for the pairwise contrasts encoded as "stars", character.
P-value for the pairwise contrasts, character.
The coefficient or estimate for the difference between compared pairs of levels.
t-statistic estimates for the pairwise contrasts, character.
If label.type = "letters" or label.type = "LETTERS" the returned tibble contains the columns listed below.
x position, numeric.
y position, numeric.
P-value used in pairwise tests, numeric.
Set according to the method used.
Most derived class of the fitted model object.
Formula extracted from the fitted model object if available, or the formula argument.
Formula extracted from the fitted model object if available, or the formula argument, formatted as character.
The method used to adjust the P-values.
The type of contrast used for multiple comparisons.
The total number of observations or rows in data.
text label, always included, but possibly NA.
If output.type is not "numeric" the returned data frame includes in addition the following labels:
Letters that distinguish levels based on significance from multiple comparisons test.
stat_signif() in package 'ggsignif' is an earlier and independent implementation of pairwise tests.
R option OutDec is obeyed based on its value at the time the plot is rendered, i.e., displayed or printed. Set options(OutDec = ",") for languages like Spanish or French.
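Because the option is read at rendering time, it has to be in effect when the plot is printed rather than when it is built. A minimal sketch, assuming 'ggpmisc' is the package exporting stat_multcomp(); the names old.options and p.comma are illustrative.

```r
library(ggplot2)
library(ggpmisc) # assumed to be the package exporting stat_multcomp()

p.comma <- ggplot(mpg, aes(factor(cyl), hwy)) +
  geom_boxplot(width = 0.33) +
  stat_multcomp()

# options() returns the previous settings, which makes restoring them easy
old.options <- options(OutDec = ",")
print(p.comma) # rendered while the decimal comma is in effect
options(old.options)
```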
Meier, Lukas (2022) ANOVA and Mixed Models: A Short Introduction Using R. Chapter 3 Contrasts and Multiple Testing. The R Series. Boca Raton: Chapman and Hall/CRC. ISBN: 9780367704209, doi:10.1201/9781003146216.
This statistic uses the implementation of Tests of General Linear Hypotheses in function glht. See summary.glht and p.adjust for the supported tests and the references therein for the theory behind them.
p1 <- ggplot(mpg, aes(factor(cyl), hwy)) +
geom_boxplot(width = 0.33)
## labelled bars
p1 +
stat_multcomp()
# test against a control, with first level being the control
# change order of factor levels in data to set the control group
p1 +
stat_multcomp(contrast.type = "Dunnet")
# different methods to adjust the contrasts
p1 +
stat_multcomp(adjusted.type = "bonferroni")
p1 +
stat_multcomp(adjusted.type = "holm")
p1 +
stat_multcomp(adjusted.type = "fdr")
# sometimes we need to expand the plotting area
p1 +
stat_multcomp(geom = "text_pairwise") +
scale_y_continuous(expand = expansion(mult = c(0.05, 0.10)))
# position of contrasts' bars (based on scale limits)
p1 +
stat_multcomp(label.y = "bottom")
p1 +
stat_multcomp(label.y = 11)
# use different labels: difference and P-value from hypothesis tests
p1 +
stat_multcomp(use_label(c("Delta", "P")),
size = 2.75)
# control smallest P-value displayed and number of digits
p1 +
stat_multcomp(p.digits = 4)
# label only significant differences
# but test and correct for all pairwise contrasts!
p1 +
stat_multcomp(mc.cutoff.p.value = 0.01)
## letters as labels for test results
p1 +
stat_multcomp(label.type = "letters")
# use capital letters
p1 +
stat_multcomp(label.type = "LETTERS")
# location
p1 +
stat_multcomp(label.type = "letters",
label.y = "top")
p1 +
stat_multcomp(label.type = "letters",
label.y = 0)
# stricter critical p-value than default used for test
p1 +
stat_multcomp(label.type = "letters",
mc.critical.p.value = 0.01)
# Inspecting the returned data using geom_debug()
# This provides a quick way of finding out the names of the variables that
# are available for mapping to aesthetics with after_stat().
gginnards.installed <- requireNamespace("gginnards", quietly = TRUE)
if (gginnards.installed)
library(gginnards)
if (gginnards.installed)
p1 +
stat_multcomp(label.type = "bars",
geom = "debug")
if (gginnards.installed)
p1 +
stat_multcomp(label.type = "letters",
geom = "debug")
if (gginnards.installed)
p1 +
stat_multcomp(label.type = "bars",
output.type = "numeric",
geom = "debug")