tabfreq: Generate Frequency Tables for Statistical Reports
In tab: Functions for Creating Summary Tables for Statistical Reports


This function creates an I-by-J frequency table and summarizes the results in a clean table for a statistical report.
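
As a rough illustration of what such a table summarizes, the base R sketch below builds a 2-by-2 frequency table with `n (percent)` cells and runs two of the tests named under `test`. The data are made up for this example, and the code is not `tabfreq`'s implementation. Note that `chisq.test` applies a continuity correction to 2-by-2 tables by default; whether `tabfreq` does the same is not stated on this page.

```r
# Toy data, made up for illustration (not the package's dataset d)
group <- rep(c("Control", "Treatment"), each = 20)
sex <- c(rep("Female", 12), rep("Male", 8),
         rep("Female", 7), rep("Male", 13))

# Rows are levels of y (sex), columns are levels of x (group)
counts <- table(sex, group)

# Column percentages: distribution of sex within each group
col_pct <- 100 * prop.table(counts, margin = 2)

# Cells of the form "n (percent)", stored as a character matrix
cells <- matrix(sprintf("%d (%.1f)", as.vector(counts), as.vector(col_pct)),
                nrow = nrow(counts), dimnames = dimnames(counts))
print(cells)

# Base R tests of association (Pearson's chi-squared and Fisher's exact)
p_chi <- chisq.test(counts)$p.value
p_fisher <- fisher.test(counts)$p.value
```

Here `margin = 2` in `prop.table` gives column percentages, the analogue of `"col.percent"`; `margin = 1` would give row percentages.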

tabfreq(x, y, latex = FALSE, xlevels = NULL, yname = NULL, ylevels = NULL,
        quantiles = NULL, quantile.vals = FALSE, cell = "n", parenth = NULL,
        text.label = NULL, parenth.sep = "-", test = "chi", decimals = NULL,
        p.include = TRUE, p.decimals = c(2, 3), p.cuts = 0.01,
        p.lowerbound = 0.001, p.leading0 = TRUE, p.avoid1 = FALSE,
        overall.column = TRUE, n.column = FALSE, n.headings = TRUE,
        compress = FALSE, compress.val = NULL, bold.colnames = TRUE,
        bold.varnames = FALSE, bold.varlevels = FALSE,
        variable.colname = "Variable", print.html = FALSE,
        html.filename = "table1.html")

`x`	Vector of values indicating group membership for columns of IxJ table.
`y`	Vector of values indicating group membership for rows of IxJ table.
`latex`	If `TRUE`, object returned is formatted for printing in LaTeX using `xtable` [1]; if `FALSE`, formatted for copy-and-pasting from RStudio into a word processor.
`xlevels`	Optional character vector to label the levels of `x`, used in the column headings. If unspecified, the function uses the values that `x` takes on.
`yname`	Optional label for the `y` (row) variable. If unspecified, variable name of `y` is used.
`ylevels`	Optional character vector to label the levels of `y`. If unspecified, the function uses the values that `y` takes on. Note that levels of `y` will be listed in the order that they appear when you run `table(y, x)`.
`quantiles`	If specified, function compares distribution of the `y` variable across quantiles of the `x` variable. For example, if `x` contains continuous BMI values and `y` is race, setting `quantiles = 3` would result in the distribution of race being compared across tertiles of BMI.
`quantile.vals`	If `TRUE`, labels for `x` show quantile number and corresponding range of the `x` variable, e.g. Q1 [0.00, 0.25). If `FALSE`, labels for quantiles just show quantile number, e.g. Q1. Only used if `xlevels` is not specified.
`cell`	Controls what value is placed in each cell of the table. Possible choices are `"n"` for counts, `"tot.percent"` for table percentage, `"col.percent"` for column percentage, `"row.percent"` for row percentage, `"tot.prop"` for table proportion, `"col.prop"` for column proportion, `"row.prop"` for row proportion, `"n/totn"` for count/total counts, `"n/coln"` for count/column count, and `"n/rown"` for count/row count.
`parenth`	Controls what values (if any) are placed in parentheses after the values in each cell. By default, if `cell` is `"n"`, `"n/totn"`, `"n/coln"`, or `"n/rown"` then the corresponding percentage is shown in parentheses; if `cell` is `"tot.percent"`, `"col.percent"`, `"row.percent"`, `"tot.prop"`, `"col.prop"`, or `"row.prop"` then a 95% confidence interval for the requested percentage or proportion is shown in parentheses. Possible values are `"none"`, `"se"` for standard error of requested percentage or proportion, `"ci"` for 95% confidence interval for requested percentage or proportion, and `"tot.percent"`, `"col.percent"`, `"row.percent"`, `"tot.prop"`, `"col.prop"`, and `"row.prop"` for various percentages and proportions.
`text.label`	Optional text to put after the `y` variable name, identifying what cell values and parentheses indicate in the table. If unspecified, function uses default labels based on `cell` and `parenth`. Set to `"none"` for no text labels.
`parenth.sep`	Optional character specifying the separator between lower and upper bound of confidence interval (when requested). Usually either `"-"` or `", "` depending on user preference.
`test`	Controls test for association between `x` and `y`. Use `"chi"` for Pearson's chi-squared test, which is valid only in large samples; `"fisher"` for Fisher's exact test, which is valid in small or large samples; `"z"` for z test without continuity correction; or `"z.continuity"` for z test with continuity correction. `"z"` and `"z.continuity"` can only be used if `x` and `y` are binary.
`decimals`	Number of decimal places for values in table (no decimals are used for counts). If unspecified, function uses 1 decimal for percentages and 3 decimals for proportions.
`p.include`	If `FALSE`, statistical test is not performed and p-value is not returned.
`p.decimals`	Number of decimal places for p-values. If a vector is provided rather than a single value, number of decimal places will depend on what range the p-value lies in. See `p.cuts`.
`p.cuts`	Cut-point(s) to control number of decimal places used for p-values. For example, by default `p.cuts = 0.01` and `p.decimals = c(2, 3)`. This means that p-values in the range [0.01, 1] will be printed to two decimal places, while p-values in the range [0, 0.01) will be printed to three decimal places.
`p.lowerbound`	Controls cut-point at which p-values are no longer printed as their value, but rather <lowerbound. For example, by default `p.lowerbound = 0.001`. Under this setting, p-values less than 0.001 are printed as `<0.001`.
`p.leading0`	If `TRUE`, p-values are printed with 0 before decimal place; if `FALSE`, the leading 0 is omitted.
`p.avoid1`	If `TRUE`, p-values rounded to 1 are not printed as 1, but as `>0.99` (or similarly depending on `p.decimals` and `p.cuts`).
`overall.column`	If `FALSE`, column showing distribution of `y` in full sample is suppressed.
`n.column`	If `TRUE`, the table will have a column for sample size.
`n.headings`	If `TRUE`, the table will indicate the sample size overall and in each group in parentheses after the column headings.
`compress`	If `y` has only two levels, setting `compress` to `TRUE` will produce a single row rather than two rows. For example, if `y` is sex with 0 for female, 1 for male, and `cell = "n"` and `parenth = "col.percent"`, setting `compress = TRUE` will return a table with `n (percent)` for males only. If `FALSE`, the table would show `n (percent)` for both males and females, which is somewhat redundant.
`compress.val`	When `x` and `y` are both binary and `compress = TRUE`, `compress.val` can be used to specify which level of the `y` variable should be shown. For example, if `x` is sex and `y` is obesity status with levels `"Obese"` and `"Not Obese"`, setting `compress = TRUE` and `compress.val = "Not Obese"` would result in the table comparing the proportions of subjects that are not obese by sex.
`bold.colnames`	If `TRUE`, column headings are printed in bold font. Only applies if `latex = TRUE`.
`bold.varnames`	If `TRUE`, variable name in the first column of the table is printed in bold font. Only applies if `latex = TRUE`.
`bold.varlevels`	If `TRUE`, levels of the y variable are printed in bold font. Only applies if `latex = TRUE`.
`variable.colname`	Character string with desired heading for first column of table, which shows the `y` variable name and levels.
`print.html`	If `TRUE`, function prints a .html file to the current working directory.
`html.filename`	Character string indicating the name of the .html file that gets printed if `print.html = TRUE`.
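
The interaction of `p.decimals`, `p.cuts`, `p.lowerbound`, and `p.leading0` can be sketched in base R as follows. The helper `format_p_sketch` is hypothetical and written only for this illustration; it is not part of tab, handles a single cut-point only, ignores `p.avoid1`, and takes its defaults from the usage signature above.

```r
# Hypothetical helper, not part of the tab package: formats one p-value
# using a single cut-point, mirroring the defaults in the usage signature.
format_p_sketch <- function(p, decimals = c(2, 3), cuts = 0.01,
                            lowerbound = 0.001, leading0 = TRUE) {
  if (p < lowerbound) {
    # Below the lower bound, report "<lowerbound" instead of the value
    out <- paste0("<", lowerbound)
  } else {
    # At or above the cut-point use decimals[1], otherwise decimals[2]
    digits <- if (p >= cuts) decimals[1] else decimals[2]
    out <- formatC(p, digits = digits, format = "f")
  }
  if (!leading0) {
    # Drop the 0 before the decimal point, keeping any leading "<"
    out <- sub("^(<?)0\\.", "\\1.", out)
  }
  out
}

format_p_sketch(0.2)                     # two decimals
format_p_sketch(0.004)                   # three decimals
format_p_sketch(0.0004)                  # below the lower bound
format_p_sketch(0.03, leading0 = FALSE)  # no leading zero
```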

A character matrix with the requested frequency table. If latex = TRUE, the character matrix will be formatted for inserting into a Markdown/Sweave/knitr report using xtable [1].

If you wish to paste your tables into Word, you can use either of these approaches:

1. Use the write.cb function in the Kmisc package [2]. If your table is stored in a character matrix named table1, use write.cb(table1) to copy the table to your clipboard. Paste the result into Word, then highlight the text and go to Insert - Table - Convert Text to Table... OK.

2. Set print.html = TRUE. This writes a .html file to your current working directory. When you open this file, you will see a formatted table that you can copy and paste into Word. You can control the name of this file with html.filename.

If you wish to use LaTeX, R Markdown, knitr, Sweave, etc., set latex = TRUE and then use xtable [1]. You may have to set sanitize.text.function = identity when calling print.xtable.
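
A minimal sketch of the xtable step, assuming the xtable package is installed. The matrix `table1` below is a hand-made stand-in, not actual `tabfreq` output, and the `\textbf{}` headings are only an assumption about what LaTeX markup in the cells might look like. It shows why `sanitize.text.function = identity` can matter: without it, `print.xtable` escapes the backslashes and braces.

```r
# Hypothetical stand-in for a character matrix containing LaTeX markup
table1 <- matrix(c("Female", "12 (60.0)", "7 (35.0)",
                   "Male",   "8 (40.0)",  "13 (65.0)"),
                 ncol = 3, byrow = TRUE)
colnames(table1) <- c("\\textbf{Variable}", "\\textbf{Control}",
                      "\\textbf{Treatment}")

# Print LaTeX code, leaving the markup in the cells and headings untouched
if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(table1), include.rownames = FALSE,
        sanitize.text.function = identity)
}
```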

If you have suggestions for additional options or features, or if you would like some help using any function in tab, please e-mail me at vandomed@gmail.com. Thanks!

Dane R. Van Domelen

1. Dahl DB (2013). xtable: Export tables to LaTeX or HTML. R package version 1.7-1, https://cran.r-project.org/package=xtable.

2. Ushey K (2013). Kmisc: Kevin Miscellaneous. R package version 0.5.0, https://CRAN.R-project.org/package=Kmisc.

Acknowledgment: This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-0940903.

tabmeans
tabmedians
tabmulti
tabglm
tabcox
tabgee
tabfreq.svy
tabmeans.svy
tabmedians.svy
tabmulti.svy
tabglm.svy

# Load in sample dataset d and drop rows with missing values
data(d)
d <- d[complete.cases(d), ]

# Compare sex distribution by group, with group as column variable
freqtable1 <- tabfreq(x = d$Group, y = d$Sex)

# Same comparison, but compress table to show Female row only, show percent (SE)
# rather than n (percent), and suppress (n = ) from column headings
freqtable2 <- tabfreq(x = d$Group, y = d$Sex, compress = TRUE,
                      compress.val = "Female", cell = "col.percent",
                      parenth = "se", n.headings = FALSE)

# Compare sex distribution by race, suppressing (n = ) from column headings and
# showing percent (95% CI) rather than n (percent)
freqtable3 <- tabfreq(x = d$Race, y = d$Sex, n.headings = FALSE,
                      cell = "col.percent")

# Use rbind to create single table comparing sex and race in control vs.
# treatment group
freqtable4 <- rbind(tabfreq(x = d$Group, y = d$Sex),
                    tabfreq(x = d$Group, y = d$Race))

# A (usually) faster way to make the above table is to call the tabmulti
# function
freqtable5 <- tabmulti(dataset = d, xvarname = "Group",
                       yvarnames = c("Sex", "Race"))

# freqtable4 and freqtable5 are equivalent
all(freqtable4 == freqtable5)

Pearson's chi-square test was used to test whether the distribution of Sex differed across groups.
Pearson's chi-square test was used to test whether the distribution of Sex differed across groups.
Pearson's chi-square test was used to test whether the distribution of Sex differed across groups.
Pearson's chi-square test was used to test whether the distribution of Sex differed across groups.
Pearson's chi-square test was used to test whether the distribution of Race differed across groups.
Pearson's chi-square test was used to test whether the distribution of Sex differed across groups.
Pearson's chi-square test was used to test whether the distribution of Race differed across groups.
[1] TRUE