View source: R/mann_whitney_test.R
mann_whitney_test | R Documentation |
This function performs a Mann-Whitney test (or Wilcoxon rank
sum test for unpaired samples). Unlike the underlying base R function
wilcox.test(), this function allows for weighted tests and automatically
calculates effect sizes. For paired (dependent) samples, or for one-sample
tests, please use the wilcoxon_test() function.
A Mann-Whitney test is a non-parametric test for the null hypothesis that two
independent samples have identical continuous distributions. It can be used
for ordinal scales or when the two continuous variables are not normally
distributed. For large samples, or approximately normally distributed variables,
the t_test() function can be used.
mann_whitney_test(
  data,
  select = NULL,
  by = NULL,
  weights = NULL,
  mu = 0,
  alternative = "two.sided",
  ...
)
data |
A data frame. |
select |
Name(s) of the continuous variable(s) (as character vector)
to be used as samples for the test.
|
by |
Name of the variable indicating the groups. Required if |
weights |
Name of an (optional) weighting variable to be used for the test. |
mu |
The hypothesized difference in means (for |
alternative |
A character string specifying the alternative hypothesis,
must be one of |
... |
Additional arguments passed to |
This function is based on wilcox.test() and coin::wilcox_test()
(the latter to extract effect sizes). The weighted version of the test is
based on survey::svyranktest().
Interpretation of the effect size r, as a rule-of-thumb:
small effect >= 0.1
medium effect >= 0.3
large effect >= 0.5
r is calculated as r = \frac{|Z|}{\sqrt{n_1 + n_2}}, where n_1 and n_2 are the sizes of the two groups.
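As a rough illustration of the formula and the rule-of-thumb above, the following base R sketch computes r from a Z statistic and two group sizes. The values of `z`, `n1` and `n2` are hypothetical. The helper `interpret_r_sketch()` and the label "very small" for values below 0.1 are assumptions made for this example and are not part of the package.

```r
# Hypothetical inputs, chosen only for illustration
z  <- -2.4   # Z statistic of the test
n1 <- 30     # size of group 1
n2 <- 34     # size of group 2

# r = |Z| / sqrt(n1 + n2)
r <- abs(z) / sqrt(n1 + n2)

# Hypothetical helper: map r onto the rule-of-thumb thresholds
# (>= 0.1 small, >= 0.3 medium, >= 0.5 large). The label for
# values below 0.1 is an assumption of this sketch.
interpret_r_sketch <- function(r) {
  labels <- c("very small", "small", "medium", "large")
  labels[findInterval(r, c(0.1, 0.3, 0.5)) + 1]
}

r
interpret_r_sketch(c(0.05, 0.2, 0.6))
```

With these inputs the denominator is sqrt(64) = 8, so r is about 0.3.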
A data frame with test results. The function returns p- and Z-values as well as the effect size r and the group rank means.
The following table provides an overview of which test to use for different types of data. The choice of test depends on the scale of the outcome variable and the number of samples to compare.
Samples | Scale of Outcome | Significance Test |
1 | binary / nominal | chi_squared_test() |
1 | continuous, not normal | wilcoxon_test() |
1 | continuous, normal | t_test() |
2, independent | binary / nominal | chi_squared_test() |
2, independent | continuous, not normal | mann_whitney_test() |
2, independent | continuous, normal | t_test() |
2, dependent | binary (only 2x2) | chi_squared_test(paired=TRUE) |
2, dependent | continuous, not normal | wilcoxon_test() |
2, dependent | continuous, normal | t_test(paired=TRUE) |
>2, independent | continuous, not normal | kruskal_wallis_test() |
>2, independent | continuous, normal | datawizard::means_by_group() |
>2, dependent | continuous, not normal | not yet implemented (1) |
>2, dependent | continuous, normal | not yet implemented (2) |
(1) More than two dependent samples are considered as repeated measurements.
For ordinal or not-normally distributed outcomes, these samples are
usually tested using a friedman.test(), which requires the samples
in one variable, the groups to compare in another variable, and a third
variable indicating the repeated measurements (subject IDs).
(2) More than two dependent samples are considered as repeated measurements. For normally distributed outcomes, these samples are usually tested using an ANOVA for repeated measurements. A more sophisticated approach would be using a linear mixed model.
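The data layout described in footnote (1) can be sketched with base R's friedman.test(). This is a minimal example on simulated data, not a recommendation for a particular analysis. The data frame `long_rm` and its columns `id`, `time` and `score` are hypothetical names invented for this illustration.

```r
# Simulated repeated measurements: 10 subjects, 3 time points each
# (all names and values are hypothetical)
set.seed(42)
long_rm <- data.frame(
  id    = factor(rep(1:10, times = 3)),               # subject IDs
  time  = factor(rep(c("t1", "t2", "t3"), each = 10)), # groups to compare
  score = sample(1:7, 30, replace = TRUE)              # ordinal outcome
)

# Formula interface: outcome ~ groups | blocks (subjects)
res <- friedman.test(score ~ time | id, data = long_rm)
res
```

Each subject must appear exactly once per time point, since the test expects a complete block design without replication.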
Ben-Shachar, M.S., Patil, I., Thériault, R., Wiernik, B.M., Lüdecke, D. (2023). Phi, Fei, Fo, Fum: Effect Sizes for Categorical Data That Use the Chi‑Squared Statistic. Mathematics, 11, 1982. doi:10.3390/math11091982
Bender, R., Lange, S., Ziegler, A. Wichtige Signifikanztests. Dtsch Med Wochenschr 2007; 132: e24–e25
du Prel, J.B., Röhrig, B., Hommel, G., Blettner, M. Auswahl statistischer Testverfahren. Dtsch Arztebl Int 2010; 107(19): 343–8
t_test() for parametric t-tests of dependent and independent samples.
mann_whitney_test() for non-parametric tests of unpaired (independent) samples.
wilcoxon_test() for Wilcoxon signed-rank tests, i.e. non-parametric tests of paired (dependent) samples.
kruskal_wallis_test() for non-parametric tests with more than two independent samples.
chi_squared_test() for chi-squared tests (two categorical variables, dependent and independent).
data(efc)
# Mann-Whitney-U tests for elder's age by elder's sex.
mann_whitney_test(efc, "e17age", by = "e16sex")
# base R equivalent
wilcox.test(e17age ~ e16sex, data = efc)
# when data is in wide-format, specify all relevant continuous
# variables in `select` and omit `by`
set.seed(123)
wide_data <- data.frame(scale1 = runif(20), scale2 = runif(20))
mann_whitney_test(wide_data, select = c("scale1", "scale2"))
# base R equivalent
wilcox.test(wide_data$scale1, wide_data$scale2)
# same as if we had data in long format, with grouping variable
long_data <- data.frame(
  scales = c(wide_data$scale1, wide_data$scale2),
  groups = as.factor(rep(c("A", "B"), each = 20))
)
mann_whitney_test(long_data, select = "scales", by = "groups")
# base R equivalent
wilcox.test(scales ~ groups, long_data)