# Generate Summary Tables of Mean Comparisons for Statistical Reports

### Description

This function compares the mean of a continuous variable across levels of a categorical variable and summarizes the results in a clean table (or figure) for a statistical report.

### Usage

```
tabmeans(x, y, latex = FALSE, variance = "unequal", xname = NULL, xlevels = NULL,
         yname = NULL, quantiles = NULL, quantile.vals = FALSE, parenth = "sd",
         text.label = NULL, parenth.sep = "-", decimals = NULL, p.include = TRUE,
         p.decimals = c(2, 3), p.cuts = 0.01, p.lowerbound = 0.001, p.leading0 = TRUE,
         p.avoid1 = FALSE, overall.column = TRUE, n.column = FALSE, n.headings = TRUE,
         bold.colnames = TRUE, bold.varnames = FALSE, variable.colname = "Variable",
         fig = FALSE, fig.errorbars = "z.ci", fig.title = NULL, print.html = FALSE,
         html.filename = "table1.html")
```

### Arguments

`x`
Vector of values for the categorical x variable.

`y`
Vector of values for the continuous y variable.

`latex`
If TRUE, object returned is formatted for printing in LaTeX using xtable [1]; if FALSE, formatted for copy-and-pasting from RStudio into a word processor.

`variance`
Controls whether equal variance t-test or unequal variance t-test is used when x has two levels. Possible values are "equal" for equal variance, "unequal" for unequal variance, or "ftest" for F test to determine which version of the t-test to use. Note that unequal variance t-test is less restrictive than equal variance t-test, and the F test is only valid when y is normally distributed in both x groups.

`xname`
Label for the categorical variable. Only used if fig is TRUE.

`xlevels`
Optional character vector to label the levels of x, used in the column headings. If unspecified, the function uses the values that x takes on.

`yname`
Optional label for the continuous y variable. If unspecified, variable name of y is used.

`quantiles`
If specified, function compares means of the y variable across quantiles of the x variable. For example, if x contains continuous BMI values and y contains continuous HDL cholesterol levels, setting quantiles to 3 would result in mean HDL being compared across tertiles of BMI.

`quantile.vals`
If TRUE, labels for x show quantile number and corresponding range of the x variable. For example, Q1 [0.00, 0.25). If FALSE, labels for quantiles just show quantile number (e.g. Q1). Only used if xlevels is not specified.

`parenth`
Controls what values (if any) are placed in parentheses after the means in each cell. Possible values are "none", "sd" for standard deviation, "se" for standard error, "t.ci" for 95% confidence interval for population mean based on t distribution, and "z.ci" for 95% confidence interval for population mean based on z distribution.

`text.label`
Optional text to put after the y variable name, identifying what cell values and parentheses indicate in the table. If unspecified, function uses default labels based on parenth, e.g. M (SD) if parenth is "sd". Set to "none" for no text labels.

`parenth.sep`
Optional character specifying the separator between lower and upper bound of confidence interval (when requested). Usually either "-" or ", " depending on user preference.

`decimals`
Number of decimal places for means and standard deviations/standard errors/confidence intervals. If unspecified, function uses 0 decimal places if the largest mean (in magnitude) is in [1,000, Inf), 1 decimal place if [10, 1,000), 2 decimal places if [0.1, 10), 3 decimal places if [0.01, 0.1), 4 decimal places if [0.001, 0.01), 5 decimal places if [0.0001, 0.001), and 6 decimal places if [0, 0.0001).

`p.include`
If FALSE, t-test is not performed and p-value is not returned.

`p.decimals`
Number of decimal places for p-values. If a vector is provided rather than a single value, number of decimal places will depend on what range the p-value lies in. See p.cuts.

`p.cuts`
Cut-point(s) to control number of decimal places used for p-values. For example, by default p.cuts is 0.01 and p.decimals is c(2, 3). This means that p-values in the range [0.01, 1] will be printed to two decimal places, while p-values in the range [0, 0.01) will be printed to three decimal places.

`p.lowerbound`
Controls cut-point at which p-values are no longer printed as their value, but rather as <lowerbound. For example, by default p.lowerbound is 0.001. Under this setting, p-values less than 0.001 are printed as <0.001.

`p.leading0`
If TRUE, p-values are printed with 0 before the decimal point; if FALSE, the leading 0 is omitted.

`p.avoid1`
If TRUE, p-values rounded to 1 are not printed as 1, but as >0.99 (or similarly depending on values for p.decimals and p.cuts).

`overall.column`
If FALSE, column showing mean of y in full sample is suppressed.

`n.column`
If TRUE, the table will have a column for (unweighted) sample size.

`n.headings`
If TRUE, the table will indicate the (unweighted) sample size overall and in each group in parentheses after the column headings.

`bold.colnames`
If TRUE, column headings are printed in bold font. Only applies if latex = TRUE.

`bold.varnames`
If TRUE, variable name in the first column of the table is printed in bold font. Only applies if latex = TRUE.

`variable.colname`
Character string with desired heading for first column of table, which shows the y variable name.

`fig`
If TRUE, a figure is returned rather than a table. The figure shows the mean for each level of x, with error bars controlled by fig.errorbars (95% confidence interval by default).

`fig.errorbars`
Controls error bars around mean when fig is TRUE. Possible values are "sd" for +/- 1 standard deviation, "se" for +/- 1 standard error, "t.ci" for 95% confidence interval based on t distribution, "z.ci" for 95% confidence interval based on z distribution, and "none" for no error bars.

`fig.title`
Title of figure. If unspecified, title is set to "Mean yname by xname".

`print.html`
If TRUE, function writes an .html file to the current working directory.

`html.filename`
Character string indicating the name of the .html file that gets written if print.html is set to TRUE.
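The default rounding rules described for `decimals`, `p.decimals`, `p.cuts`, `p.lowerbound`, and `p.leading0` can be sketched in base R. The helpers `default_decimals()` and `format_p()` below are hypothetical names written for illustration, not functions from the package; they assume the default cut-point of 0.01 shown under Usage and do not cover `p.avoid1`.

```r
# Hypothetical helper: default number of decimals chosen from the largest
# mean (in magnitude), following the ranges listed for the decimals argument.
default_decimals <- function(means) {
  m <- max(abs(means))
  if (m >= 1000) 0
  else if (m >= 10) 1
  else if (m >= 0.1) 2
  else if (m >= 0.01) 3
  else if (m >= 0.001) 4
  else if (m >= 0.0001) 5
  else 6
}

# Hypothetical helper: format one p-value. Assumes decimals[1] applies to
# p-values at or above the largest cut-point, decimals[2] to the next range
# down, and so on.
format_p <- function(p, decimals = c(2, 3), cuts = 0.01,
                     lowerbound = 0.001, leading0 = TRUE) {
  if (p < lowerbound) {
    out <- paste0("<", format(lowerbound, scientific = FALSE))
  } else {
    idx <- 1 + sum(p < cuts)                 # which range p falls in
    d <- decimals[min(idx, length(decimals))]
    out <- formatC(p, format = "f", digits = d)
  }
  if (!leading0) {
    # Drop the 0 before the decimal point, keeping any leading "<"
    out <- sub("^(<?)0\\.", "\\1.", out)
  }
  out
}
```

For instance, `format_p(0.2)` uses two decimal places because 0.2 is at or above the cut-point, while `format_p(0.004)` uses three.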

### Details

If x has two levels, a t-test is used to test for a difference in means. If x has more than two levels, a one-way analysis of variance is used to test for a difference in means across the groups.
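As a rough illustration of this rule, the base-R sketch below picks a test from the number of levels of x. The function name `mean_comparison_p()` and the 0.05 threshold used for the "ftest" option are assumptions made for illustration; the package's own implementation may differ.

```r
# Hypothetical helper: p-value for a difference in mean y across levels of x
mean_comparison_p <- function(x, y, variance = "unequal") {
  x <- factor(x)
  if (nlevels(x) == 2) {
    if (variance == "ftest") {
      # Assumed rule: use the equal-variance t-test unless the F test
      # rejects equal variances at the 0.05 level
      equal <- var.test(y ~ x)$p.value >= 0.05
    } else {
      equal <- variance == "equal"
    }
    t.test(y ~ x, var.equal = equal)$p.value
  } else {
    # One-way analysis of variance across more than two groups
    summary(aov(y ~ x))[[1]][["Pr(>F)"]][1]
  }
}

# Simulated data (not from the package's sample dataset)
set.seed(1)
group2 <- rep(c("Control", "Treatment"), each = 20)
group3 <- rep(c("A", "B", "C"), length.out = 40)
bmi <- rnorm(40, mean = 25, sd = 3)

p2 <- mean_comparison_p(group2, bmi)   # two levels: Welch t-test
p3 <- mean_comparison_p(group3, bmi)   # three levels: one-way ANOVA
```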

Both x and y can have missing values. The function drops observations with missing x or y.
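The effect of dropping incomplete observations can be reproduced in base R with `complete.cases()`. The vectors below are made-up values used only to show the behavior.

```r
# Made-up example vectors with missing values in both x and y
x <- c("Control", "Treatment", NA, "Control", "Treatment")
y <- c(24.1, NA, 27.3, 22.8, 30.2)

# Keep only observations where both x and y are observed
keep <- complete.cases(x, y)
x_obs <- x[keep]
y_obs <- y[keep]
```

Here the second observation (missing y) and the third (missing x) are dropped, leaving three complete pairs.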

### Value

A character matrix with the requested table comparing mean y across levels of x. If latex is set to TRUE, the character matrix will be formatted for inserting into a Markdown/Sweave/knitr report using the xtable package [1]. If fig is set to TRUE, a figure is returned instead of a table.

### Note

If you wish to paste your tables into Word, you can use either of these approaches:

1. Use the write.cb function in the Kmisc package [2]. If your table is stored in a character matrix named table1, use write.cb(table1) to copy the table to your clipboard. Paste the result into Word, then highlight the text and go to Insert - Table - Convert Text to Table... OK.

2. Set the print.html input to TRUE. This writes an .html file to your current working directory. When you open this file, you will see a formatted table that you can copy and paste into Word. You can control the name of this file with the html.filename input.

If you wish to use LaTeX, R Markdown, knitr, Sweave, etc., please see the package vignette for examples. In most cases, you have to set the latex input to TRUE and then use the xtable package [1].

If you have suggestions for additional options or features, or if you would like some help using any function in the tab package, please e-mail me at vandomed@gmail.com. Thanks!

### Author(s)

Dane R. Van Domelen

### References

1. Dahl DB (2013). xtable: Export tables to LaTeX or HTML. R package version 1.7-1, https://cran.r-project.org/package=xtable.

2. Ushey K (2013). Kmisc: Kevin Miscellaneous. R package version 0.5.0, https://cran.r-project.org/package=Kmisc.

Acknowledgment: This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-0940903.

### See Also

`tabfreq`, `tabmedians`, `tabmulti`, `tabglm`, `tabcox`, `tabgee`, `tabfreq.svy`, `tabmeans.svy`, `tabmedians.svy`, `tabmulti.svy`, `tabglm.svy`

### Examples

```
# Load the package, then load sample dataset d and drop rows with missing values
library(tab)
data(d)
d <- d[complete.cases(d), ]

# Compare mean BMI in control group vs. treatment group - table and figure
meanstable1 <- tabmeans(x = d$Group, y = d$BMI)
meansfig1 <- tabmeans(x = d$Group, y = d$BMI, fig = TRUE)

# Compare mean BMI by race - table and figure
meanstable2 <- tabmeans(x = d$Race, y = d$BMI)
meansfig2 <- tabmeans(x = d$Race, y = d$BMI, fig = TRUE)

# Compare mean baseline systolic BP across tertiles of BMI - table and figure
meanstable3 <- tabmeans(x = d$BMI, y = d$bp.1, yname = "Systolic BP", quantiles = 3)
meansfig3 <- tabmeans(x = d$BMI, y = d$bp.1, quantiles = 3, fig = TRUE,
                      yname = "Systolic BP", xname = "BMI Tertile")

# Create single table comparing mean BMI and mean age in control vs. treatment group
meanstable4 <- rbind(tabmeans(x = d$Group, y = d$BMI), tabmeans(x = d$Group, y = d$Age))

# An easier way to make the above table is to call the tabmulti function
meanstable5 <- tabmulti(dataset = d, xvarname = "Group", yvarnames = c("BMI", "Age"))

# meanstable4 and meanstable5 are equivalent
all(meanstable4 == meanstable5)

# To move meanstable1 into Word, run write.cb(meanstable1) (from the Kmisc package) to
# copy the table onto your clipboard. Paste into Word, highlight the table and go to
# Insert - Table - Convert Text to Table... OK. Alternatively, if you set print.html to
# TRUE, the function will write an html file named html.filename to your current working
# directory. You can open this file, copy the table, and paste it into Word.
```
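The tertile comparison in the examples can be approximated in base R by cutting x at its sample quantiles. This is only a sketch: the simulated vectors `bmi` and `sbp` stand in for `d$BMI` and `d$bp.1`, and the package may define quantile boundaries differently (for example, which endpoint of each interval is closed).

```r
# Simulated stand-ins for BMI and systolic blood pressure
set.seed(42)
bmi <- rnorm(90, mean = 27, sd = 4)
sbp <- 100 + 0.8 * bmi + rnorm(90, sd = 10)

# Cut BMI at its tertile boundaries (0%, 33.3%, 66.7%, 100%)
breaks <- quantile(bmi, probs = seq(0, 1, length.out = 4))
tertile <- cut(bmi, breaks = breaks, include.lowest = TRUE,
               labels = c("Q1", "Q2", "Q3"))

# Mean systolic BP and sample size within each BMI tertile
tertile_means <- tapply(sbp, tertile, mean)
tertile_n <- table(tertile)
```

The resulting factor could then be passed as a categorical x to a mean comparison, which is roughly what setting `quantiles = 3` is described as doing.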