tests: Tests/methods available in 'add_p()' and 'add_difference()'

tests R Documentation

Tests/methods available in add_p() and add_difference()

Description

Below is a listing of tests available internally within gtsummary.

Tests whose pseudo-code includes ... accept additional arguments, which are passed with add_p(test.args=). For example, to calculate a p-value from t.test() assuming equal variance, use tbl_summary(trial, by = trt) %>% add_p(age ~ "t.test", test.args = age ~ list(var.equal = TRUE)).
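A minimal sketch of the same mechanism applied to several variables at once. It assumes the trial data set bundled with gtsummary and uses the all_tests() selector to pick every variable that was assigned a given test; the choice of columns is illustrative only.

```r
library(gtsummary)

# Request a t-test for all continuous variables, then forward
# var.equal = TRUE to every variable that ended up with "t.test"
tbl <-
  trial[c("age", "marker", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(
    test = all_continuous() ~ "t.test",
    test.args = all_tests("t.test") ~ list(var.equal = TRUE)
  )

tbl
```

Arguments supplied this way only reach tests whose pseudo-code contains ... in the tables below.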

tbl_summary() %>% add_p()

| alias | description | pseudo-code | details |
| --- | --- | --- | --- |
| 't.test' | t-test | `t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)` | |
| 'aov' | One-way ANOVA | `aov(variable ~ as.factor(by), data = data) %>% summary()` | |
| 'mood.test' | Mood two-sample test of scale | `mood.test(variable ~ as.factor(by), data = data, ...)` | Not to be confused with the Brown-Mood test of medians |
| 'oneway.test' | One-way ANOVA | `oneway.test(variable ~ as.factor(by), data = data, ...)` | |
| 'kruskal.test' | Kruskal-Wallis test | `kruskal.test(data[[variable]], as.factor(data[[by]]))` | |
| 'wilcox.test' | Wilcoxon rank-sum test | `wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, ...)` | |
| 'chisq.test' | chi-square test of independence | `chisq.test(x = data[[variable]], y = as.factor(data[[by]]), ...)` | |
| 'chisq.test.no.correct' | chi-square test of independence | `chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE)` | |
| 'fisher.test' | Fisher's exact test | `fisher.test(data[[variable]], as.factor(data[[by]]), conf.level = 0.95, ...)` | |
| 'mcnemar.test' | McNemar's test | `tidyr::pivot_wider(id_cols = group, ...); mcnemar.test(by_1, by_2, conf.level = 0.95, ...)` | |
| 'mcnemar.test.wide' | McNemar's test | `mcnemar.test(data[[variable]], data[[by]], conf.level = 0.95, ...)` | |
| 'lme4' | random intercept logistic regression | `lme4::glmer(by ~ (1 \| group), data, family = binomial) %>% anova(lme4::glmer(by ~ variable + (1 \| group), data, family = binomial))` | |
| 'paired.t.test' | Paired t-test | `tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)` | |
| 'paired.wilcox.test' | Paired Wilcoxon signed-rank test | `tidyr::pivot_wider(id_cols = group, ...); wilcox.test(by_1, by_2, paired = TRUE, conf.int = TRUE, conf.level = 0.95, ...)` | |
| 'prop.test' | Test for equality of proportions | `prop.test(x, n, conf.level = 0.95, ...)` | |
| 'ancova' | ANCOVA | `lm(variable ~ by + adj.vars)` | |
| 'emmeans' | Estimated Marginal Means or LS-means | `lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs = ~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)` | When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept. |
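The paired tests above first reshape the data so that each level of the by variable becomes its own column, matched on the group identifier. The sketch below imitates that step for 'paired.t.test' in base R; the toy data, the column names, and the use of reshape() in place of tidyr::pivot_wider() are assumptions made for illustration.

```r
# Toy long-format data: one row per subject (id) and treatment arm
long <- data.frame(
  id     = rep(1:6, times = 2),
  arm    = rep(c("A", "B"), each = 6),
  marker = c(5.1, 4.8, 6.0, 5.5, 4.9, 5.7,
             5.6, 5.0, 6.4, 5.9, 5.1, 6.3)
)

# Base-R stand-in for tidyr::pivot_wider(id_cols = group, ...):
# one row per id, one marker column per arm (marker.A, marker.B)
wide <- reshape(long, idvar = "id", timevar = "arm", direction = "wide")

# Paired t-test on the two matched columns
paired <- t.test(wide$marker.A, wide$marker.B, paired = TRUE, conf.level = 0.95)

paired$estimate  # mean of the within-pair differences (A minus B)
paired$p.value
```

Subjects observed in only one arm end up with a missing value after reshaping, so they contribute no pair to the test.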

tbl_svysummary() %>% add_p()

| alias | description | pseudo-code | details |
| --- | --- | --- | --- |
| 'svy.t.test' | t-test adapted to complex survey samples | `survey::svyttest(~variable + by, data)` | |
| 'svy.wilcox.test' | Wilcoxon rank-sum test for complex survey samples | `survey::svyranktest(~variable + by, data, test = 'wilcoxon')` | |
| 'svy.kruskal.test' | Kruskal-Wallis rank-sum test for complex survey samples | `survey::svyranktest(~variable + by, data, test = 'KruskalWallis')` | |
| 'svy.vanderwaerden.test' | van der Waerden's normal-scores test for complex survey samples | `survey::svyranktest(~variable + by, data, test = 'vanderWaerden')` | |
| 'svy.median.test' | Mood's test for the median for complex survey samples | `survey::svyranktest(~variable + by, data, test = 'median')` | |
| 'svy.chisq.test' | chi-squared test with Rao & Scott's second-order correction | `survey::svychisq(~variable + by, data, statistic = 'F')` | |
| 'svy.adj.chisq.test' | chi-squared test adjusted by a design effect estimate | `survey::svychisq(~variable + by, data, statistic = 'Chisq')` | |
| 'svy.wald.test' | Wald test of independence for complex survey samples | `survey::svychisq(~variable + by, data, statistic = 'Wald')` | |
| 'svy.adj.wald.test' | adjusted Wald test of independence for complex survey samples | `survey::svychisq(~variable + by, data, statistic = 'adjWald')` | |
| 'svy.lincom.test' | test of independence using the exact asymptotic distribution for complex survey samples | `survey::svychisq(~variable + by, data, statistic = 'lincom')` | |
| 'svy.saddlepoint.test' | test of independence using a saddlepoint approximation for complex survey samples | `survey::svychisq(~variable + by, data, statistic = 'saddlepoint')` | |
| 'emmeans' | Estimated Marginal Means or LS-means | `survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs = ~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)` | When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used. |
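The pseudo-code above abbreviates the formulas. As a sketch of how the underlying survey calls can look when written out, the example below uses the apiclus1 data shipped with the survey package. The design and the variables (sch.wide, stype, enroll, comp.imp) are illustrative choices and are not part of gtsummary.

```r
library(survey)

data(api)  # loads apiclus1, a one-stage cluster sample of schools
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# 'svy.chisq.test': Rao & Scott second-order correction
rao_scott <- svychisq(~sch.wide + stype, dclus1, statistic = "F")

# 'svy.wald.test': Wald test of independence
wald <- svychisq(~sch.wide + stype, dclus1, statistic = "Wald")

# 'svy.t.test': design-based t-test; svyttest() takes outcome ~ group
ttest <- svyttest(enroll ~ comp.imp, dclus1)

rao_scott
```

Note that svyttest() itself expects a two-sided formula, so the one-sided form in the table should be read as shorthand rather than as a literal call.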

tbl_survfit() %>% add_p()

| alias | description | pseudo-code |
| --- | --- | --- |
| 'logrank' | Log-rank test | `survival::survdiff(Surv(.) ~ variable, data, rho = 0)` |
| 'tarone' | Tarone-Ware test | `survival::survdiff(Surv(.) ~ variable, data, rho = 1.5)` |
| 'petopeto_gehanwilcoxon' | Peto & Peto modification of Gehan-Wilcoxon test | `survival::survdiff(Surv(.) ~ variable, data, rho = 1)` |
| 'survdiff' | G-rho family test | `survival::survdiff(Surv(.) ~ variable, data, ...)` |
| 'coxph_lrt' | Cox regression (LRT) | `survival::coxph(Surv(.) ~ variable, data, ...)` |
| 'coxph_wald' | Cox regression (Wald) | `survival::coxph(Surv(.) ~ variable, data, ...)` |
| 'coxph_score' | Cox regression (Score) | `survival::coxph(Surv(.) ~ variable, data, ...)` |
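The first three aliases differ only in the rho argument passed to survdiff(). The sketch below runs two of them on the lung data from the survival package. How gtsummary extracts the p-value is not stated above, so the helper p_from_survdiff() is a hypothetical name that applies the usual chi-square calculation with (number of groups - 1) degrees of freedom.

```r
library(survival)

# lung ships with the survival package; sex defines the two groups
logrank  <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
petopeto <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)

# p-value from the chi-square statistic (assumes no strata in the model)
p_from_survdiff <- function(fit) {
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

p_from_survdiff(logrank)
p_from_survdiff(petopeto)
```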

tbl_continuous() %>% add_p()

| alias | description | pseudo-code |
| --- | --- | --- |
| 'anova_2way' | Two-way ANOVA | `lm(continuous_variable ~ by + variable)` |
| 't.test' | t-test | `t.test(continuous_variable ~ as.factor(variable), data = data, conf.level = 0.95, ...)` |
| 'aov' | One-way ANOVA | `aov(continuous_variable ~ as.factor(variable), data = data) %>% summary()` |
| 'kruskal.test' | Kruskal-Wallis test | `kruskal.test(data[[continuous_variable]], as.factor(data[[variable]]))` |
| 'wilcox.test' | Wilcoxon rank-sum test | `wilcox.test(as.numeric(continuous_variable) ~ as.factor(variable), data = data, ...)` |
| 'lme4' | random intercept logistic regression | `lme4::glmer(by ~ (1 \| group), data, family = binomial) %>% anova(lme4::glmer(variable ~ continuous_variable + (1 \| group), data, family = binomial))` |
| 'ancova' | ANCOVA | `lm(continuous_variable ~ variable + adj.vars)` |

tbl_summary() %>% add_difference()

| alias | description | difference statistic | pseudo-code | details |
| --- | --- | --- | --- | --- |
| 't.test' | t-test | mean difference | `t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)` | |
| 'paired.t.test' | Paired t-test | mean difference | `tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)` | |
| 'prop.test' | Test for equality of proportions | rate difference | `prop.test(x, n, conf.level = 0.95, ...)` | |
| 'ancova' | ANCOVA | mean difference | `lm(variable ~ by + adj.vars)` | |
| 'ancova_lme4' | ANCOVA with random intercept | mean difference | `lme4::lmer(variable ~ by + adj.vars + (1 \| group), data)` | |
| 'cohens_d' | Cohen's D | standardized mean difference | `effectsize::cohens_d(variable ~ by, data, ci = conf.level, ...)` | |
| 'smd' | Standardized Mean Difference | standardized mean difference | `smd::smd(x = data[[variable]], g = data[[by]], std.error = TRUE)` | |
| 'emmeans' | Estimated Marginal Means or LS-means | adjusted mean difference | `lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs = ~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)` | When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept. |
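To make the 'ancova' row concrete, the sketch below fits the same kind of model in base R and reads off the adjusted mean difference with its confidence interval. The mtcars data and the variable roles are stand-ins chosen for illustration, and the sign convention (which group is subtracted from which) may differ from what gtsummary reports.

```r
# mtcars stands in for a real data set: am is the two-level 'by' variable,
# mpg the outcome, and wt the single adjustment variable (adj.vars)
fit <- lm(mpg ~ factor(am) + wt, data = mtcars)

# Adjusted difference in mean mpg (am = 1 minus am = 0), holding wt fixed
est <- coef(fit)["factor(am)1"]

# Matching 95% confidence interval for that coefficient
ci <- confint(fit, "factor(am)1", level = 0.95)

est
ci
```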

tbl_svysummary() %>% add_difference()

| alias | description | difference statistic | pseudo-code | details |
| --- | --- | --- | --- | --- |
| 'smd' | Standardized Mean Difference | standardized mean difference | `smd::smd(x = data$variables[[variable]], g = data$variables[[by]], w = weights(data), std.error = TRUE)` | |
| 'emmeans' | Estimated Marginal Means or LS-means | adjusted mean difference | `survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs = ~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)` | When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used. |

Custom Functions

To report a p-value (or difference) for a test not available in gtsummary, you can create a custom function. The function must return a data frame with a single row, structured like the output of broom::tidy() for a typical statistical test. The add_p() and add_difference() functions will look for columns called "p.value", "estimate", "conf.low", "conf.high", and "method" for the p-value, difference, confidence interval, and the test name used in the footnote.
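The one-row data frame can also be assembled by hand, without broom. The sketch below wraps a two-sample Kolmogorov-Smirnov test; the function name ks_test_custom and the use of the ToothGrowth data are hypothetical choices for illustration, and only the "p.value" and "method" columns are filled in.

```r
# Hypothetical custom test: two-sample Kolmogorov-Smirnov test
ks_test_custom <- function(data, variable, by, ...) {
  # split the outcome into one vector per level of the by variable
  groups <- split(data[[variable]], data[[by]])
  # exact = FALSE requests the asymptotic p-value, which tolerates ties
  res <- stats::ks.test(groups[[1]], groups[[2]], exact = FALSE)
  # one row, with the column names gtsummary looks for
  data.frame(p.value = res$p.value, method = res$method)
}

# Called directly, outside gtsummary, to inspect the structure
out <- ks_test_custom(ToothGrowth, variable = "len", by = "supp")
out
```

Inside a table, such a function would be referenced by name in the test argument, in the same way as the built-in aliases.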

Example calculating a p-value from a t-test assuming a common variance between groups.

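library(gtsummary) # also provides %>% and the trial data set

ttest_common_variance <- function(data, variable, by, ...) {
  # keep only the two columns needed and drop rows with missing values
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  # pooled-variance t-test, returned as a one-row data frame
  t.test(data[[variable]] ~ factor(data[[by]]), var.equal = TRUE) %>%
    broom::tidy()
}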

trial[c("age", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = age ~ "ttest_common_variance")

A custom function for add_difference() is similar, and also accepts the arguments conf.level= and adj.vars=.

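ttest_common_variance <- function(data, variable, by, conf.level, ...) {
  # keep only the two columns needed and drop rows with missing values
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  # pooled-variance t-test; conf.level sets the width of the interval
  t.test(data[[variable]] ~ factor(data[[by]]), conf.level = conf.level, var.equal = TRUE) %>%
    broom::tidy()
}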

Function Arguments

For tbl_summary() objects, the custom function will be passed the following arguments: custom_pvalue_fun(data=, variable=, by=, group=, type=, conf.level=, adj.vars=). While your function may not utilize each of these arguments, these arguments are passed and the function must accept them. We recommend including ... to future-proof against updates where additional arguments are added.
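As a sketch of a signature that accepts everything listed above, the function below names each argument, ignores the ones it does not need, and keeps ... for arguments added later. The name mean_diff_custom, the default values, and the ToothGrowth data are assumptions for illustration; the body reports a Welch t-test difference in means.

```r
mean_diff_custom <- function(data, variable, by, group = NULL, type = NULL,
                             conf.level = 0.95, adj.vars = NULL, ...) {
  # variable and by arrive as strings naming columns of data
  groups <- split(data[[variable]], data[[by]])
  stopifnot(length(groups) == 2)

  res <- t.test(groups[[1]], groups[[2]], conf.level = conf.level)

  # one row with the columns used for the difference, CI, p-value, footnote
  data.frame(
    estimate  = unname(res$estimate[1] - res$estimate[2]),
    conf.low  = res$conf.int[1],
    conf.high = res$conf.int[2],
    p.value   = res$p.value,
    method    = res$method
  )
}

# An unknown extra argument is absorbed by ... instead of raising an error
out <- mean_diff_custom(ToothGrowth, variable = "len", by = "supp",
                        conf.level = 0.90, not_yet_existing_arg = TRUE)
out
```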

The following table describes the argument inputs for each gtsummary table type.

| argument | tbl_summary | tbl_svysummary | tbl_survfit | tbl_continuous |
| --- | --- | --- | --- | --- |
| data= | A data frame | A survey object | A survfit() object | A data frame |
| variable= | String variable name | String variable name | NA | String variable name |
| by= | String variable name | String variable name | NA | String variable name |
| group= | String variable name | NA | NA | String variable name |
| type= | Summary type | Summary type | NA | NA |
| conf.level= | Confidence interval level | NA | NA | NA |
| adj.vars= | Character vector of adjustment variable names (e.g. used in ANCOVA) | NA | NA | Character vector of adjustment variable names (e.g. used in ANCOVA) |
| continuous_variable= | NA | NA | NA | String of the continuous variable name |

gtsummary documentation built on July 26, 2023, 5:27 p.m.