tabmedians: Generate Summary Tables of Median Comparisons for Statistical...
In tab: Functions for Creating Summary Tables for Statistical Reports


This function compares the median of a continuous variable across levels of a categorical variable and summarizes the results in a clean table for a statistical report.

tabmedians(x, y, latex = FALSE, xlevels = NULL, yname = NULL, quantiles = NULL,
           quantile.vals = FALSE, parenth = "iqr", text.label = NULL,
           parenth.sep = "-", decimals = NULL, p.include = TRUE,
           p.decimals = c(2, 3), p.cuts = 0.01, p.lowerbound = 0.001,
           p.leading0 = TRUE, p.avoid1 = FALSE, overall.column = TRUE,
           n.column = FALSE, n.headings = TRUE, bold.colnames = TRUE,
           bold.varnames = FALSE, variable.colname = "Variable",
           print.html = FALSE, html.filename = "table1.html")

`x`	Vector of values for the categorical variable.
`y`	Vector of values for the continuous variable.
`latex`	If `TRUE`, object returned is formatted for printing in LaTeX using `xtable` [1]; if `FALSE`, formatted for copy-and-pasting from RStudio into a word processor.
`xlevels`	Optional character vector to label the levels of `x`, used in the column headings. If unspecified, the function uses the values that `x` takes on.
`yname`	Optional label for the `y` (row) variable. If unspecified, variable name of `y` is used.
`quantiles`	If specified, function compares medians of the `y` variable across quantiles of the `x` variable. For example, if `x` contains continuous BMI values and `y` contains continuous HDL cholesterol levels, setting `quantiles = 3` would result in median HDL being compared across tertiles of BMI.
`quantile.vals`	If `TRUE`, labels for `x` show quantile number and corresponding range of the `x` variable, e.g. Q1 [0.00, 0.25). If `FALSE`, labels for quantiles just show quantile number, e.g. Q1. Only used if `xlevels` is not specified.
`parenth`	Controls what values (if any) are placed in parentheses after the medians in each cell. Possible values are `"none"`, `"iqr"` for difference between first and third quartiles, `"range"` for difference between minimum and maximum, `"minmax"` for minimum and maximum, `"q1q3"` for first and third quartiles, and `"ci.90"`, `"ci.95"`, or `"ci.99"` for confidence intervals for the medians (based on binomial probabilities if one or more groups have n less than 10, otherwise based on normal approximation to binomial).
`text.label`	Optional text to put after the `y` variable name, identifying what cell values and parentheses indicate in the table. If unspecified, function uses default labels based on `parenth`, e.g. Median (IQR) if `parenth = "iqr"`. Set to `"none"` for no text labels.
`parenth.sep`	Optional character specifying the separator for the two numbers in parentheses when `parenth` is set to `"minmax"` or `"q1q3"`. The default is a dash, so values in the table are formatted as Median (Lower-Upper). If you set `parenth.sep = ", "` the values in the table will instead be formatted as Median (Lower, Upper).
`decimals`	Number of decimal places for values in table. If unspecified, function uses 0 decimal places if the largest median (in magnitude) is in [1,000, Inf), 1 decimal place if [10, 1,000), 2 decimal places if [0.1, 10), 3 decimal places if [0.01, 0.1), 4 decimal places if [0.001, 0.01), 5 decimal places if [0.0001, 0.001), and 6 decimal places if [0, 0.0001).
`p.include`	If `FALSE`, statistical test is not performed and p-value is not returned.
`p.decimals`	Number of decimal places for p-values. If a vector is provided rather than a single value, number of decimal places will depend on what range the p-value lies in. See `p.cuts`.
`p.cuts`	Cut-point(s) to control number of decimal places used for p-values. For example, by default `p.cuts = 0.01` and `p.decimals = c(2, 3)`. This means that p-values in the range [0.01, 1] will be printed to two decimal places, while p-values in the range [0, 0.01) will be printed to three decimal places.
`p.lowerbound`	Controls cut-point at which p-values are no longer printed as their value, but rather <lowerbound. For example, by default `p.lowerbound = 0.001`. Under this setting, p-values less than 0.001 are printed as `<0.001`.
`p.leading0`	If `TRUE`, p-values are printed with 0 before decimal place; if `FALSE`, the leading 0 is omitted.
`p.avoid1`	If `TRUE`, p-values rounded to 1 are not printed as 1, but as `>0.99` (or similarly depending on `p.decimals` and `p.cuts`).
`overall.column`	If `FALSE`, column showing median of `y` in full sample is suppressed.
`n.column`	If `TRUE`, the table will have a column for sample size.
`n.headings`	If `TRUE`, the table will indicate the sample size overall and in each group in parentheses after the column headings.
`bold.colnames`	If `TRUE`, column headings are printed in bold font. Only applies if `latex = TRUE`.
`bold.varnames`	If `TRUE`, variable name in the first column of the table is printed in bold font. Only applies if `latex = TRUE`.
`variable.colname`	Character string with desired heading for first column of table, which shows the `y` variable name.
`print.html`	If `TRUE`, function prints a .html file to the current working directory.
`html.filename`	Character string indicating the name of the .html file that gets printed if `print.html = TRUE`.
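
The interaction between `p.decimals`, `p.cuts`, `p.lowerbound`, and `p.leading0` can be sketched with a small helper. This is an illustration of the documented behavior only, not the package's internal code: `format_p` is a hypothetical name, it handles a single p-value, it assumes cut-points are listed from largest to smallest to match the order of `decimals`, and it ignores `p.avoid1`.

```r
# Hypothetical helper mirroring the documented p-value formatting rules.
# Defaults are taken from the tabmedians usage signature.
format_p <- function(p, decimals = c(2, 3), cuts = 0.01,
                     lowerbound = 0.001, leading0 = TRUE) {
  if (p < lowerbound) {
    # Below the lower bound, print "<lowerbound" instead of the value
    out <- paste0("<", format(lowerbound, scientific = FALSE))
  } else {
    # Count how many cut-points p falls below to pick the decimals entry
    idx <- 1 + sum(p < cuts)
    out <- formatC(p, digits = decimals[idx], format = "f")
  }
  if (!leading0) {
    # Drop the 0 in front of the decimal point
    out <- sub("0\\.", ".", out)
  }
  out
}

format_p(0.0473)                    # "0.05"   (in [0.01, 1], two decimals)
format_p(0.0042)                    # "0.004"  (in [0, 0.01), three decimals)
format_p(0.00002)                   # "<0.001" (below p.lowerbound)
format_p(0.0473, leading0 = FALSE)  # ".05"
```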

If x has two levels, a Mann-Whitney U (also known as Wilcoxon rank-sum) test is used to test whether the distribution of the continuous variable (y) differs in the two groups (x). If x has more than two levels, a Kruskal-Wallis test is used to test whether the distribution of y differs across at least two of the x groups.
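
The choice of test described above can be reproduced with base R's `wilcox.test` and `kruskal.test`. The sketch below assumes those are suitable stand-ins; whether `tabmedians` calls them with the same options is not stated here. `median_test_p` is a hypothetical helper, and the built-in `mtcars` data is used only for illustration.

```r
# Hypothetical helper: pick the test from the number of groups in x
median_test_p <- function(x, y) {
  x <- factor(x)  # also drops unused levels if x is already a factor
  if (nlevels(x) < 2) {
    stop("x must have at least two levels")
  }
  if (nlevels(x) == 2) {
    # Mann-Whitney U / Wilcoxon rank-sum test; exact = FALSE uses the normal
    # approximation, which avoids the warning about ties
    wilcox.test(y ~ x, exact = FALSE)$p.value
  } else {
    # Kruskal-Wallis test for three or more groups
    kruskal.test(y ~ x)$p.value
  }
}

# Two groups (transmission type) and three groups (number of cylinders)
p_am  <- median_test_p(mtcars$am,  mtcars$mpg)
p_cyl <- median_test_p(mtcars$cyl, mtcars$mpg)
```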

Both x and y can have missing values. The function drops observations with missing x or y.
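
Dropping observations with a missing value in either variable can be pictured with base R's `complete.cases`. This is a sketch of the behavior, not necessarily how the function implements it, and the vectors are made up for illustration.

```r
# Made-up example vectors with one missing value each
x <- c("A", "B", NA, "A", "B")
y <- c(1.2, NA, 3.4, 5.6, 7.8)

# Keep only positions where both x and y are observed
keep <- complete.cases(x, y)
x_clean <- x[keep]
y_clean <- y[keep]

x_clean  # "A" "A" "B"
y_clean  # 1.2 5.6 7.8
```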

A character matrix with the requested table comparing median y across levels of x. If latex = TRUE, the character matrix will be formatted for inserting into a Markdown/Sweave/knitr report using the xtable package [1].

If you wish to paste your tables into Word, you can use either of these approaches:

1. Use the write.cb function in the Kmisc package [2]. If your table is stored in a character matrix named table1, use write.cb(table1) to copy the table to your clipboard. Paste the result into Word, then highlight the text and go to Insert - Table - Convert Text to Table... OK.

2. Set print.html = TRUE. This writes a .html file to your current working directory. When you open this file, you will see a formatted table that you can copy and paste into Word. You can control the name of this file with html.filename.

If you wish to use LaTeX, R Markdown, knitr, Sweave, etc., set latex = TRUE and then use xtable [1]. You may have to set sanitize.text.function = identity when calling print.xtable.
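
A minimal sketch of that workflow follows. It assumes the xtable package is installed. The matrix `tab_latex` is a hypothetical stand-in with placeholder cells, not real `tabmedians` output, and the actual layout returned with latex = TRUE may differ. It shows why `sanitize.text.function = identity` can matter: without it, `print.xtable` escapes backslashes, braces, and dollar signs, so LaTeX markup in the cells would be printed literally.

```r
library(xtable)

# Hypothetical stand-in for a latex = TRUE result (placeholder cell contents)
tab_latex <- matrix(
  c("\\textbf{BMI}, Median (IQR)", "$<$0.001"),
  nrow = 1,
  dimnames = list(NULL, c("\\textbf{Variable}", "\\textbf{P}"))
)

# identity leaves LaTeX markup in the cells and headings untouched;
# include.rownames = FALSE hides the automatic row numbers
print(xtable(tab_latex), include.rownames = FALSE,
      sanitize.text.function = identity)
```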

If you have suggestions for additional options or features, or if you would like some help using any function in tab, please e-mail me at vandomed@gmail.com. Thanks!

Dane R. Van Domelen

1. Dahl DB (2013). xtable: Export tables to LaTeX or HTML. R package version 1.7-1, https://cran.r-project.org/package=xtable.

2. Ushey K (2013). Kmisc: Kevin Miscellaneous. R package version 0.5.0, https://CRAN.R-project.org/package=Kmisc.

Acknowledgment: This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-0940903.

tabfreq
tabmeans
tabmulti
tabglm
tabcox
tabgee
tabfreq.svy
tabmeans.svy
tabmedians.svy
tabmulti.svy
tabglm.svy

# Load the tab package, which provides tabmedians, tabmulti, and dataset d
library(tab)

# Load in sample dataset d and drop rows with missing values
data(d)
d <- d[complete.cases(d), ]

# Create labels for group and race
groups <- c("Control", "Treatment")
races <- c("White", "Black", "Mexican American", "Other")

# Compare median BMI in control group vs. treatment group
medtable1 <- tabmedians(x = d$Group, y = d$BMI)

# Repeat, but show first and third quartile rather than IQR in parentheses
medtable2 <- tabmedians(x = d$Group, y = d$BMI, parenth = "q1q3")

# Compare median BMI by race, suppressing overall column and (n = ) part of
# headings
medtable3 <- tabmedians(x = d$Race, y = d$BMI, overall.column = FALSE,
                        n.headings = FALSE)

# Compare median BMI by quartile of age
medtable4 <- tabmedians(x = d$Age, y = d$BMI, quantiles = 4)

# Create single table comparing median BMI and median age in control vs.
# treatment group
medtable5 <- rbind(tabmedians(x = d$Group, y = d$BMI),
                   tabmedians(x = d$Group, y = d$Age))

# A (usually) faster way to make the above table is to call the tabmulti
# function
medtable6 <- tabmulti(dataset = d, xvarname = "Group",
                      yvarnames = c("BMI", "Age"), ymeasures = "median")

# medtable5 and medtable6 are equivalent
all(medtable5 == medtable6)