rm_compactsum: Output a compact summary table

rm_compactsum    R Documentation

Output a compact summary table

Description

Outputs a table of summary statistics formatted for PDF, Word or HTML output.

Usage

rm_compactsum(
  data,
  xvars,
  grp,
  use_mean,
  caption = NULL,
  tableOnly = FALSE,
  covTitle = "",
  digits = 1,
  digits.cat = 0,
  nicenames = TRUE,
  iqr = TRUE,
  all.stats = FALSE,
  pvalue = TRUE,
  effSize = FALSE,
  p.adjust = "none",
  unformattedp = FALSE,
  show.sumstats = FALSE,
  show.tests = FALSE,
  full = TRUE,
  percentage = "col"
)

Arguments

data

dataframe containing data

xvars

character vector with the names of covariates to include in table

grp

character with the name of the grouping variable

use_mean

logical indicating whether the mean and standard deviation will be returned for continuous variables instead of the median. Alternatively, a character vector containing the names of the covariates for which the mean and sd should be returned. If use_mean is not supplied, all covariates will have median summaries. See examples.

caption

character containing table caption (default is no caption)

tableOnly

logical, if TRUE then a dataframe is returned, otherwise a formatted printed object is returned (default is FALSE)

covTitle

character with the name of the covariate (predictor) column. The default is to leave this empty for formatted output or, for table-only output, to use the column name 'Covariate'.

digits

numeric specifying the number of digits for summarizing mean data. Digits can be specified for individual variables using a named vector in the format digits=c("var1"=2,"var2"=3). If a variable is not in the vector, the default will be used for it (default is 1). See examples.

digits.cat

numeric specifying the number of digits for the proportions when summarizing categorical data (default is 0)

nicenames

logical indicating if you want to replace . and _ in strings with a space

iqr

logical indicating if you want to display the interquartile range (Q1-Q3) as opposed to (min-max) in the summary for continuous variables

all.stats

logical indicating if all summary statistics (Q1, Q3 + min, max on a separate line) should be displayed. Overrides iqr

pvalue

logical indicating if you want p-values included in the table

effSize

logical indicating if you want effect sizes and their 95% confidence intervals included in the table. Effect sizes calculated include Cramer's V for categorical variables, and Cohen's d, Wilcoxon r, Epsilon-squared, or Omega-squared for numeric/continuous variables

p.adjust

method of p-value adjustment to be performed (default is "none", i.e. no adjustment)

unformattedp

logical indicating if you would like the p-value to be returned unformatted (i.e. not rounded or prefixed with '<'). Best used with tableOnly = TRUE and the outTable function. See examples.

show.sumstats

logical indicating if the type of statistical summary (mean, median, etc) used should be shown.

show.tests

logical indicating if the type of statistical test and effect size (if effSize = TRUE) used should be shown in a column beside the p-values.

full

logical indicating if you want the full sample included in the table, ignored if grp is not specified

percentage

choice of how percentages are presented, either by column ("col", the default) or by row
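
The named-vector form of digits described above can be pictured as a lookup with a fallback to the default. The following is a minimal base R sketch of that idea; resolve_digits is a hypothetical helper written for illustration and is not part of reportRmd, so the package's internal handling may differ.

```r
# Hypothetical helper (not part of reportRmd): work out how many digits
# each covariate would get from a single value or a named vector.
resolve_digits <- function(xvars, digits = 1, default = 1) {
  if (is.null(names(digits))) {
    # A single unnamed value applies to every covariate
    return(setNames(rep(digits[1], length(xvars)), xvars))
  }
  # Start from the default, then overwrite the covariates that are named
  out <- setNames(rep(default, length(xvars)), xvars)
  matched <- intersect(names(digits), xvars)
  out[matched] <- digits[matched]
  out
}

res <- resolve_digits(c("age", "l_size", "pdl1"),
                      digits = c("age" = 2, "l_size" = 3))
print(res)  # age and l_size use the supplied values, pdl1 falls back to 1
```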

Details

Comparisons for categorical variables default to chi-square tests, but if there are counts of <5 then Fisher's exact test will be used. For grouping variables with two levels, either t-tests (mean) or Wilcoxon tests (median) will be used for numerical variables. Otherwise, ANOVA (mean) or Kruskal-Wallis tests (median) will be used. The statistical test used can be displayed by specifying show.tests = TRUE. Statistical tests and effect sizes for grp and/or xvars with fewer than 2 counts in any level will not be shown.
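
The selection rule in the paragraph above can be sketched in base R as follows. This is an illustration only: choose_test is a hypothetical helper, not a reportRmd function, and it assumes that "counts of <5" refers to observed cell counts in the cross-tabulation, which the documentation does not spell out.

```r
# Hypothetical helper (not part of reportRmd): name the test that the
# rule described above would pick for covariate x and grouping variable g.
choose_test <- function(x, g, use_mean = FALSE) {
  g <- factor(g)
  if (is.numeric(x)) {
    if (nlevels(g) == 2) {
      if (use_mean) "t-test" else "Wilcoxon test"
    } else {
      if (use_mean) "ANOVA" else "Kruskal-Wallis test"
    }
  } else {
    counts <- table(x, g)
    # Assumption: the <5 rule is applied to observed cell counts
    if (any(counts < 5)) "Fisher's exact test" else "chi-square test"
  }
}

# Built-in mtcars data: am has two levels, cyl has three
t1 <- choose_test(mtcars$mpg, mtcars$am, use_mean = TRUE)  # "t-test"
t2 <- choose_test(mtcars$mpg, mtcars$cyl)                  # "Kruskal-Wallis test"
t3 <- choose_test(factor(mtcars$gear), mtcars$am)          # "Fisher's exact test"
print(c(t1, t2, t3))
```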

Effect sizes are calculated as Cohen's d for between-group differences if the variable is summarised with the mean, otherwise as Wilcoxon r if it is summarised with the median. Cramer's V is used for categorical variables, omega-squared for differences in means among more than two groups, and epsilon-squared for differences in medians among more than two groups. Confidence intervals are calculated using bootstrapping.
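
For orientation, two of the effect sizes named above can be computed with their textbook formulas in base R, together with a simple percentile bootstrap interval. This is a sketch under stated assumptions: the helper names (cohens_d, cramers_v, boot_ci) are hypothetical, and the documentation does not say which bootstrap variant or how many resamples the package uses, so the numbers need not match rm_compactsum output.

```r
# Cohen's d with a pooled standard deviation (textbook formula)
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Cramer's V from the uncorrected chi-square statistic of a contingency table
cramers_v <- function(tab) {
  chi2 <- unname(suppressWarnings(chisq.test(tab, correct = FALSE)$statistic))
  sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1)))
}

# Percentile bootstrap CI, resampling within each group.
# Assumption: B and the percentile method are illustrative choices.
boot_ci <- function(x, y, stat, B = 2000, level = 0.95) {
  reps <- replicate(B, stat(sample(x, replace = TRUE), sample(y, replace = TRUE)))
  alpha <- 1 - level
  quantile(reps, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(1)
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

d  <- cohens_d(auto, manual)           # standardised mean difference
ci <- boot_ci(auto, manual, cohens_d)  # lower and upper bootstrap limits
v  <- cramers_v(table(mtcars$cyl, mtcars$am))
print(list(d = d, ci = ci, v = v))
```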

tidyselect can only be used for the xvars and grp arguments. Other arguments that refer to variables (digits, use_mean) must be given as character strings.

Value

A character vector of the table source code, unless tableOnly = TRUE in which case a data frame is returned. The output has the following attribute:

  • "description", which describes what is included in the output table and the type of statistical summary for each covariate. When applicable, the types of statistical tests used will be included. If effSize = TRUE, the effect sizes for each covariate will also be mentioned.
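
As a short illustration of the return value, the sketch below requests a data frame with tableOnly = TRUE and then reads the "description" attribute. It assumes the reportRmd package is installed and reuses the pembrolizumab data from the Examples section; the exact wording of the description is not shown here because it depends on the inputs.

```r
library(reportRmd)
data("pembrolizumab")

# tableOnly = TRUE returns a data frame instead of formatted table source
tab <- rm_compactsum(data = pembrolizumab, xvars = c("age", "pdl1"),
                     grp = "sex", use_mean = "age", tableOnly = TRUE)

class(tab)

# Text describing the summaries (and, where applicable, the tests) used
attr(tab, "description")
```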

References

Smithson, M. (2002). Noncentral Confidence Intervals for Standardized Effect Sizes. (07/140 ed., Vol. 140). SAGE Publications. doi:10.4135/9781412983761.n4

Steiger, J. H. (2004). Beyond the F Test: Effect Size Confidence Intervals and Tests of Close Fit in the Analysis of Variance and Contrast Analysis. Psychological Methods, 9(2), 164–182. doi:10.1037/1082-989X.9.2.164

Kelley, T. L. (1935). An Unbiased Correlation Ratio Measure. Proceedings of the National Academy of Sciences - PNAS, 21(9), 554–559. doi:10.1073/pnas.21.9.554

Okada, K. (2013). Is Omega Squared Less Biased? A Comparison of Three Major Effect Size Indices in One-Way ANOVA. Behaviormetrika, 40(2), 129–147.

Breslow, N. (1970). A generalized Kruskal-Wallis test for comparing K samples subject to unequal patterns of censorship. Biometrika, 57(3), 579–594.

Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect Size Estimates: Current Use, Calculations, and Interpretation. Journal of Experimental Psychology. General, 141(1), 2–18. doi:10.1037/a0024338

Examples

data("pembrolizumab")
rm_compactsum(data = pembrolizumab, xvars = c("age",
"change_ctdna_group", "l_size", "pdl1"), grp = "sex", use_mean = "age",
digits = c("age" = 2, "l_size" = 3), digits.cat = 1, iqr = TRUE,
show.tests = TRUE)

# Other Examples (not run)
## Include the summary statistic in the variable column
#rm_compactsum(data = pembrolizumab, xvars = c("age",
#"change_ctdna_group"), grp = "sex", use_mean = "age", show.sumstats=TRUE)

## To show effect sizes
#rm_compactsum(data = pembrolizumab, xvars = c("age",
#"change_ctdna_group"), grp = "sex", use_mean = "age", digits = 2,
#effSize = TRUE, show.tests = TRUE)

## To return unformatted p-values
#rm_compactsum(data = pembrolizumab, xvars = c("l_size",
#"change_ctdna_group"), grp = "cohort", effSize = TRUE, unformattedp = TRUE)

## Using tidyselect
#pembrolizumab |> rm_compactsum(xvars = c(age, sex, pdl1), grp = cohort,
#effSize = TRUE)


reportRmd documentation built on April 4, 2025, 2:03 a.m.