pairwise | R Documentation |
This function is useful for generating and testing all pairwise comparisons of categorical terms
in a linear model. This can be done in base R using functions like pairwise.t.test and TukeyHSD,
but these functions are inconsistent both in their output format and their general approach to
pairwise comparisons. pairwise() will return a consistent table format, and will make consistent
decisions about how to calculate error terms and confidence intervals. See the Details section
below for more on how the models are tested (and why your output might not match other functions).
pairwise(
  fit,
  correction = "Tukey",
  term = NULL,
  alpha = 0.05,
  var_equal = TRUE,
  plot = FALSE
)
pairwise_t(fit, term = NULL, alpha = 0.05, correction = "none")
pairwise_bonferroni(fit, term = NULL, alpha = 0.05)
pairwise_tukey(fit, term = NULL, alpha = 0.05)
fit |
A model fit by |
correction |
The type of correction (if any) to perform to maintain the family-wise
error-rate specified by |
term |
If |
alpha |
The family-wise error-rate to restrict the tests to. If "none" is given for
|
var_equal |
If |
plot |
Setting plot to TRUE will automatically call |
For simple one-way models where a single categorical variable predicts an outcome, you will get
output similar to other methods of computing pairwise comparisons. Essentially, the differences
on the outcome between each of the groups defined by the categorical variable are compared with
the requested test, and their confidence intervals and p-values are adjusted by the requested
correction.
However, when more than two variables are entered into the model, the outcome will diverge somewhat from other methods of computing pairwise comparisons. For traditional pairwise tests you need to estimate an error term, usually by pooling the standard deviation of the groups being compared. This means that when you have other predictors in the model, their presence is ignored when running these tests. For the functions in this package, we instead compute the pooled standard error by using the mean squared error (MSE) from the full model fit.
Let's take a concrete example to explain that. If we are predicting a car's miles-per-gallon
(mpg) based on whether it has an automatic or manual transmission (am), we can create that
linear model and get the pairwise comparisons like this:
pairwise(lm(mpg ~ factor(am), data = mtcars))
The output of this code will have one table showing the comparison of manual and automatic transmissions with regard to miles-per-gallon. The pooled standard error is the same as the square root of the MSE from the full model.
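The relationship between the pooled error term and the model MSE can be checked with base R alone. The sketch below does not call pairwise(); it only illustrates, for this two-group model, that the square root of the MSE equals the classic pooled standard deviation of the two transmission groups. The variable names are chosen for illustration.

```r
# Base R only; this does not call pairwise() itself.
fit <- lm(mpg ~ factor(am), data = mtcars)

# MSE of the full model: residual sum of squares over residual degrees of freedom.
mse <- sum(residuals(fit)^2) / df.residual(fit)

# Classic pooled SD computed directly from the two transmission groups.
group_sd <- tapply(mtcars$mpg, mtcars$am, sd)
group_n  <- tapply(mtcars$mpg, mtcars$am, length)
pooled_sd <- sqrt(sum((group_n - 1) * group_sd^2) / sum(group_n - 1))

# For a one-way model the two quantities coincide.
all.equal(sqrt(mse), pooled_sd)
```

The equality holds here only because am is the sole predictor; once other terms enter the model, the MSE and the two-group pooled SD are no longer the same quantity.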
In these data the am variable did not have any other values than automatic and manual, but
we can imagine situations where the predictor has more than two levels. In these cases, the
pooled SD would be calculated by taking the MSE of the full model (not of each group) and then
weighting it based on the size of the groups in question (divide by n).
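As a rough illustration of the three-level case, the sketch below uses cyl (4, 6, or 8 cylinders) as the predictor. It assumes that "weighting by group size" means the standard textbook form sqrt(MSE * (1/n1 + 1/n2)); that is an assumption about the computation rather than a statement of what pairwise() does internally. The result is compared against base R's pairwise.t.test with a pooled SD and no correction.

```r
# Base R only; the standard-error formula below is an assumed textbook form.
fit <- lm(mpg ~ factor(cyl), data = mtcars)

# MSE from the full model, not from the two groups being compared.
mse <- sum(residuals(fit)^2) / df.residual(fit)

n     <- tapply(mtcars$mpg, mtcars$cyl, length)
means <- tapply(mtcars$mpg, mtcars$cyl, mean)

# Compare 4-cylinder and 6-cylinder cars.
diff_46 <- means[["6"]] - means[["4"]]
se_46   <- sqrt(mse * (1 / n[["4"]] + 1 / n[["6"]]))
t_46    <- diff_46 / se_46
p_46    <- 2 * pt(-abs(t_46), df.residual(fit))

# Uncorrected pooled-SD t-tests from base R for reference.
ref <- pairwise.t.test(mtcars$mpg, mtcars$cyl,
                       p.adjust.method = "none", pool.sd = TRUE)
all.equal(p_46, ref$p.value["6", "4"])
```

In a one-way model like this one, the MSE-based error term and the pooled SD used by pairwise.t.test are the same, so the uncorrected p-values agree.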
To improve our model, we might add the car's displacement (disp) as a quantitative predictor:
pairwise(lm(mpg ~ factor(am) + disp, data = mtcars))
Note that the output still only has a table for am. This is because we can't do a pairwise
comparison using disp because there are no groups to compare. Most functions will drop or not
let you use this variable during pairwise comparisons. Instead, pairwise() uses the same
approach as in the 3+ groups situation: we use the MSE for the full model and then weight it by
the size of the groups being compared. Because we are using the MSE for the full model, the
effect of disp is accounted for in the error term even though we are not explicitly comparing
different displacements. Importantly, the interpretation of the outcome is different than in
other traditional t-tests. Instead of saying, "there is a difference in miles-per-gallon based
on the type of transmission," we must add that this difference is found "after accounting for
displacement."
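To see how disp enters the error term, the base R sketch below compares the MSE of the model with and without disp and turns each into a group-size-weighted standard error. The formula sqrt(MSE * (1/n1 + 1/n2)) is again an assumed textbook form, and the sketch does not call pairwise(), so its numbers need not match that function's output.

```r
# Base R only; illustrates the error term, not the output of pairwise().
fit_am   <- lm(mpg ~ factor(am), data = mtcars)
fit_full <- lm(mpg ~ factor(am) + disp, data = mtcars)

# sigma() returns the residual standard error, so its square is the MSE.
mse_am   <- sigma(fit_am)^2
mse_full <- sigma(fit_full)^2

# Group sizes for automatic (0) and manual (1) transmissions.
n <- tapply(mtcars$mpg, mtcars$am, length)
weight <- 1 / n[["0"]] + 1 / n[["1"]]

# Assumed standard error of the am difference under each error term.
se_am   <- sqrt(mse_am * weight)
se_full <- sqrt(mse_full * weight)

c(one_way = se_am, with_disp = se_full)
```

Because disp explains a good share of the variation in mpg, the full-model MSE is smaller, and so is any error term built from it.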
A list of tables organized by the terms in the model. For each term (categorical terms only, as splitting on a continuous variable is generally uninformative), the table describes all of the pairwise-comparisons possible.