tests: Tests/methods available in 'add_p()' and 'add_difference()'


Description

Below is a listing of tests available internally within gtsummary.

Tests whose pseudo-code includes ... accept additional arguments, which are passed via add_p(test.args=). For example, to calculate a p-value from t.test() assuming equal variance, use tbl_summary(trial, by = trt) %>% add_p(age ~ "t.test", test.args = age ~ list(var.equal = TRUE)).

tbl_summary() %>% add_p()

alias description pseudo-code details
't.test' t-test t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)
'aov' One-way ANOVA aov(variable ~ as.factor(by), data = data) %>% summary()
'mood.test' Mood two-sample test of scale mood.test(variable ~ as.factor(by), data = data, ...) Not to be confused with the Brown-Mood test of medians
'oneway.test' One-way ANOVA oneway.test(variable ~ as.factor(by), data = data, ...)
'kruskal.test' Kruskal-Wallis test kruskal.test(data[[variable]], as.factor(data[[by]]))
'wilcox.test' Wilcoxon rank-sum test wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...)
'chisq.test' chi-square test of independence chisq.test(x = data[[variable]], y = as.factor(data[[by]]), ...)
'chisq.test.no.correct' chi-square test of independence chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE)
'fisher.test' Fisher's exact test fisher.test(data[[variable]], as.factor(data[[by]]), conf.level = 0.95, ...)
'mcnemar.test' McNemar's test tidyr::pivot_wider(id_cols = group, ...); mcnemar.test(by_1, by_2, conf.level = 0.95, ...)
'mcnemar.test.wide' McNemar's test mcnemar.test(data[[variable]], data[[by]], conf.level = 0.95, ...)
'lme4' random intercept logistic regression lme4::glmer(by ~ (1 | group), data, family = binomial) %>% anova(lme4::glmer(by ~ variable + (1 | group), data, family = binomial))
'paired.t.test' Paired t-test tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)
'paired.wilcox.test' Paired Wilcoxon signed-rank test tidyr::pivot_wider(id_cols = group, ...); wilcox.test(by_1, by_2, paired = TRUE, conf.int = TRUE, conf.level = 0.95, ...)
'prop.test' Test for equality of proportions prop.test(x, n, conf.level = 0.95, ...)
'ancova' ANCOVA lm(variable ~ by + adj.vars)
'emmeans' Estimated Marginal Means or LS-means lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept.
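
As a rough illustration of what the pseudo-code column describes, the sketch below runs the base R calls behind a few of the aliases on simulated data. The toy data frame and its column names are assumptions made for this example, and gtsummary's own handling of missing values and formatting is not reproduced.

```r
# Toy data standing in for the `data`, `variable`, and `by` placeholders above.
set.seed(42)
toy <- data.frame(
  by      = rep(c("A", "B"), each = 30),
  age     = c(rnorm(30, mean = 50, sd = 10), rnorm(30, mean = 55, sd = 10)),
  outcome = rbinom(60, size = 1, prob = 0.4)
)

# 't.test': Welch t-test by default. `var.equal = TRUE` is the kind of
# argument that travels through `...` when supplied via add_p(test.args = ).
p_welch  <- t.test(age ~ as.factor(by), data = toy, conf.level = 0.95)$p.value
p_pooled <- t.test(age ~ as.factor(by), data = toy, conf.level = 0.95,
                   var.equal = TRUE)$p.value

# 'wilcox.test'
p_wilcox <- wilcox.test(as.numeric(age) ~ as.factor(by), data = toy)$p.value

# 'chisq.test.no.correct': no continuity correction
p_chisq <- chisq.test(x = toy[["outcome"]], y = as.factor(toy[["by"]]),
                      correct = FALSE)$p.value

# 'fisher.test'
p_fisher <- fisher.test(toy[["outcome"]], as.factor(toy[["by"]]))$p.value

p_values <- c(welch = p_welch, pooled = p_pooled, wilcox = p_wilcox,
              chisq = p_chisq, fisher = p_fisher)
round(p_values, 3)
```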

tbl_svysummary() %>% add_p()

alias description pseudo-code details
'svy.t.test' t-test adapted to complex survey samples survey::svyttest(~variable + by, data)
'svy.wilcox.test' Wilcoxon rank-sum test for complex survey samples survey::svyranktest(~variable + by, data, test = 'wilcoxon')
'svy.kruskal.test' Kruskal-Wallis rank-sum test for complex survey samples survey::svyranktest(~variable + by, data, test = 'KruskalWallis')
'svy.vanderwaerden.test' van der Waerden's normal-scores test for complex survey samples survey::svyranktest(~variable + by, data, test = 'vanderWaerden')
'svy.median.test' Mood's test for the median for complex survey samples survey::svyranktest(~variable + by, data, test = 'median')
'svy.chisq.test' chi-squared test with Rao & Scott's second-order correction survey::svychisq(~variable + by, data, statistic = 'F')
'svy.adj.chisq.test' chi-squared test adjusted by a design effect estimate survey::svychisq(~variable + by, data, statistic = 'Chisq')
'svy.wald.test' Wald test of independence for complex survey samples survey::svychisq(~variable + by, data, statistic = 'Wald')
'svy.adj.wald.test' adjusted Wald test of independence for complex survey samples survey::svychisq(~variable + by, data, statistic = 'adjWald')
'svy.lincom.test' test of independence using the exact asymptotic distribution for complex survey samples survey::svychisq(~variable + by, data, statistic = 'lincom')
'svy.saddlepoint.test' test of independence using a saddlepoint approximation for complex survey samples survey::svychisq(~variable + by, data, statistic = 'saddlepoint')
'emmeans' Estimated Marginal Means or LS-means survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used.

tbl_survfit() %>% add_p()

alias description pseudo-code
'logrank' Log-rank test survival::survdiff(Surv(.) ~ variable, data, rho = 0)
'tarone' Tarone-Ware test survival::survdiff(Surv(.) ~ variable, data, rho = 1.5)
'petopeto_gehanwilcoxon' Peto & Peto modification of Gehan-Wilcoxon test survival::survdiff(Surv(.) ~ variable, data, rho = 1)
'survdiff' G-rho family test survival::survdiff(Surv(.) ~ variable, data, ...)
'coxph_lrt' Cox regression (LRT) survival::coxph(Surv(.) ~ variable, data, ...)
'coxph_wald' Cox regression (Wald) survival::coxph(Surv(.) ~ variable, data, ...)
'coxph_score' Cox regression (Score) survival::coxph(Surv(.) ~ variable, data, ...)
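
The three coxph aliases share the same pseudo-code, so a short sketch may help show where they can differ. The example below uses the lung data set from the survival package and reads the likelihood-ratio, Wald, and score p-values from a single fitted Cox model. That gtsummary extracts the statistics in this way is an assumption, and the helper survdiff_p() is a name introduced only for this illustration.

```r
library(survival)  # recommended package, included with standard R installs

# G-rho family: rho = 0 is the log-rank test, rho = 1 the Peto & Peto
# modification of the Gehan-Wilcoxon test.
logrank <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
peto    <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)

# Hypothetical helper: p-value from the survdiff chi-square statistic,
# using (number of groups - 1) degrees of freedom.
survdiff_p <- function(fit) {
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

# One Cox fit yields all three test statistics.
cox <- summary(coxph(Surv(time, status) ~ sex, data = lung))

p_values <- c(
  logrank     = survdiff_p(logrank),
  petopeto    = survdiff_p(peto),
  coxph_lrt   = unname(cox$logtest["pvalue"]),
  coxph_wald  = unname(cox$waldtest["pvalue"]),
  coxph_score = unname(cox$sctest["pvalue"])
)
signif(p_values, 3)
```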

tbl_continuous() %>% add_p()

alias description pseudo-code
'anova_2way' Two-way ANOVA lm(continuous_variable ~ by + variable)
't.test' t-test t.test(continuous_variable ~ as.factor(variable), data = data, conf.level = 0.95, ...)
'aov' One-way ANOVA aov(continuous_variable ~ as.factor(variable), data = data) %>% summary()
'kruskal.test' Kruskal-Wallis test kruskal.test(data[[continuous_variable]], as.factor(data[[variable]]))
'wilcox.test' Wilcoxon rank-sum test wilcox.test(as.numeric(continuous_variable) ~ as.factor(variable), data = data, ...)
'lme4' random intercept logistic regression lme4::glmer(by ~ (1 | group), data, family = binomial) %>% anova(lme4::glmer(variable ~ continuous_variable + (1 | group), data, family = binomial))
'ancova' ANCOVA lm(continuous_variable ~ variable + adj.vars)

tbl_summary() %>% add_difference()

alias description difference statistic pseudo-code details
't.test' t-test mean difference t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)
'wilcox.test' Wilcoxon rank-sum test wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...)
'paired.t.test' Paired t-test mean difference tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)
'prop.test' Test for equality of proportions rate difference prop.test(x, n, conf.level = 0.95, ...)
'ancova' ANCOVA mean difference lm(variable ~ by + adj.vars)
'ancova_lme4' ANCOVA with random intercept mean difference lme4::lmer(variable ~ by + adj.vars + (1 | group), data)
'cohens_d' Cohen's D standardized mean difference effectsize::cohens_d(variable ~ by, data, ci = conf.level, verbose = FALSE, ...)
'hedges_g' Hedges' G standardized mean difference effectsize::hedges_g(variable ~ by, data, ci = conf.level, verbose = FALSE, ...)
'paired_cohens_d' Paired Cohen's D standardized mean difference tidyr::pivot_wider(id_cols = group, ...); effectsize::cohens_d(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...)
'paired_hedges_g' Paired Hedges' G standardized mean difference tidyr::pivot_wider(id_cols = group, ...); effectsize::hedges_g(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...)
'smd' Standardized Mean Difference standardized mean difference smd::smd(x = data[[variable]], g = data[[by]], std.error = TRUE)
'emmeans' Estimated Marginal Means or LS-means adjusted mean difference lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept.

tbl_svysummary() %>% add_difference()

alias description difference statistic pseudo-code details
'smd' Standardized Mean Difference standardized mean difference smd::smd(x = variable, g = by, w = weights(data), std.error = TRUE)
'svy.t.test' t-test adapted to complex survey samples survey::svyttest(~variable + by, data)
'emmeans' Estimated Marginal Means or LS-means adjusted mean difference survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used.

Custom Functions

To report a p-value (or difference) for a test not available in gtsummary, you can create a custom function. The function must return a data frame with exactly one row, structured like the output of broom::tidy() for a typical statistical test. The add_p() and add_difference() functions will look for columns called "p.value", "estimate", "statistic", "std.error", "parameter", "conf.low", "conf.high", and "method".

You can also pass an Analysis Results Dataset (ARD) object containing the results of your custom test. These objects follow the structures outlined by the {cards} and {cardx} packages.

Example: calculating a p-value from a t-test that assumes a common variance between groups.

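# Custom test: two-sample t-test assuming equal variances.
# Returns a one-row data frame (via broom::tidy()) that includes p.value.
ttest_common_variance <- function(data, variable, by, ...) {
  # keep only rows where both columns are non-missing
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), var.equal = TRUE) %>%
    broom::tidy()
}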

trial[c("age", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = age ~ "ttest_common_variance")
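
For readers who prefer to avoid the broom and dplyr dependencies, the following is a minimal sketch of a custom test that builds the one-row data frame by hand in base R. The function name ks_test_pvalue and the choice of a two-sample Kolmogorov-Smirnov test are assumptions made for illustration only; the column names come from the list described earlier in this section.

```r
# Hypothetical custom test: two-sample Kolmogorov-Smirnov test.
ks_test_pvalue <- function(data, variable, by, ...) {
  # Keep complete cases of the two columns the test needs.
  keep <- stats::complete.cases(data[c(variable, by)])
  data <- data[keep, c(variable, by)]

  groups <- split(data[[variable]], factor(data[[by]]))
  if (length(groups) != 2) stop("`by` must have exactly two levels.")

  res <- stats::ks.test(groups[[1]], groups[[2]])

  # One-row data frame using column names that add_p() looks for.
  data.frame(
    statistic = unname(res$statistic),
    p.value   = res$p.value,
    method    = res$method,
    stringsAsFactors = FALSE
  )
}

# Call the function directly on toy data to inspect what it returns.
set.seed(1)
toy <- data.frame(
  age = c(rnorm(25, 50, 10), rnorm(25, 60, 10), NA),
  trt = c(rep(c("Drug A", "Drug B"), each = 25), "Drug A")
)
out <- ks_test_pvalue(toy, variable = "age", by = "trt")
out
```

Following the pattern of the example above, such a function would be supplied as add_p(test = age ~ "ks_test_pvalue").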

A custom add_difference() function is similar, and also accepts the conf.level= and adj.vars= arguments.

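# Same test as above, with conf.level forwarded so that the confidence
# interval columns returned by broom::tidy() match the requested level.
ttest_common_variance <- function(data, variable, by, conf.level, ...) {
  # keep only rows where both columns are non-missing
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), conf.level = conf.level, var.equal = TRUE) %>%
    broom::tidy()
}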
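
To make the expected return columns explicit, here is a sketch of a difference function written in base R, without broom or dplyr. The name mean_diff_common_variance and the toy data are assumptions for this example. The estimate is computed as the first group mean minus the second, which is the direction used by the confidence interval that t.test() reports.

```r
# Hypothetical custom difference: mean difference with a pooled-variance CI.
mean_diff_common_variance <- function(data, variable, by,
                                      conf.level = 0.95, ...) {
  # Keep complete cases of the two columns the test needs.
  keep <- stats::complete.cases(data[c(variable, by)])
  data <- data[keep, c(variable, by)]

  res <- stats::t.test(data[[variable]] ~ factor(data[[by]]),
                       conf.level = conf.level, var.equal = TRUE)

  # One-row data frame with the columns add_difference() looks for.
  data.frame(
    estimate  = unname(res$estimate[1] - res$estimate[2]),
    conf.low  = res$conf.int[1],
    conf.high = res$conf.int[2],
    statistic = unname(res$statistic),
    parameter = unname(res$parameter),
    p.value   = res$p.value,
    method    = res$method,
    stringsAsFactors = FALSE
  )
}

# Direct calls on toy data: a higher conf.level gives a wider interval.
set.seed(2)
toy <- data.frame(
  marker = c(rnorm(40, 1.0, 0.5), rnorm(40, 1.3, 0.5)),
  trt    = rep(c("Drug A", "Drug B"), each = 40)
)
diff_90 <- mean_diff_common_variance(toy, "marker", "trt", conf.level = 0.90)
diff_99 <- mean_diff_common_variance(toy, "marker", "trt", conf.level = 0.99)
rbind(diff_90, diff_99)
```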

Function Arguments

For tbl_summary() objects, the custom function is called as custom_pvalue_fun(data=, variable=, by=, group=, type=, conf.level=, adj.vars=). Your function need not use every argument, but all of them are passed, so it must accept them. We recommend including ... in the signature to future-proof against updates that add arguments.
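
A minimal sketch of such a signature is shown below. The Kruskal-Wallis body, the default values, and the argument some_future_argument are placeholders chosen for illustration; the call at the end only simulates how the function might be invoked, and is not gtsummary's actual internal call.

```r
# Hypothetical skeleton: accepts every argument listed above, plus `...`
# so that arguments added in later releases do not cause an error.
custom_pvalue_fun <- function(data, variable, by, group = NULL, type = NULL,
                              conf.level = 0.95, adj.vars = NULL, ...) {
  # Only data, variable, and by are used here; the rest are accepted and ignored.
  res <- stats::kruskal.test(data[[variable]], as.factor(data[[by]]))
  data.frame(p.value = res$p.value, method = res$method,
             stringsAsFactors = FALSE)
}

# Simulated call that also carries an argument the function does not know.
out <- custom_pvalue_fun(
  data = mtcars, variable = "mpg", by = "cyl",
  group = NULL, type = "continuous", conf.level = 0.95, adj.vars = NULL,
  some_future_argument = TRUE  # hypothetical; absorbed by `...`
)
out
```

Without ... in the signature, the same call would stop with an "unused argument" error.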

The following table describes the argument inputs for each gtsummary table type.

| argument | tbl_summary | tbl_svysummary | tbl_survfit | tbl_continuous |
|---|---|---|---|---|
| data= | A data frame | A survey object | A survfit() object | A data frame |
| variable= | String variable name | String variable name | NA | String variable name |
| by= | String variable name | String variable name | NA | String variable name |
| group= | String variable name | NA | NA | String variable name |
| type= | Summary type | Summary type | NA | NA |
| conf.level= | Confidence interval level | NA | NA | NA |
| adj.vars= | Character vector of adjustment variable names (e.g. used in ANCOVA) | NA | NA | Character vector of adjustment variable names (e.g. used in ANCOVA) |
| continuous_variable= | NA | NA | NA | String of the continuous variable name |

gtsummary documentation built on Sept. 11, 2024, 5:50 p.m.