freq_table: Estimate Percents and 95 Percent Confidence Intervals in...

Description

The freq_table function produces one-way and two-way frequency tables for categorical variables. In addition to frequencies, it displays percentages along with their standard errors and 95 percent confidence intervals. For two-way tables only, freq_table also displays row (subgroup) percentages with their standard errors and 95 percent confidence intervals.

freq_table is intended to be used in a dplyr pipeline. Specifically, freq_table expects the x argument to be a grouped tibble created with dplyr's group_by function.

All standard errors are calculated as some version of: sqrt(proportion * (1 - proportion) / (n - 1))
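As a quick illustration of the formula above, the following base R sketch computes the standard error for the proportion of cars in mtcars with am == 0. The variable names are illustrative and the code is not taken from the package source.

```r
# Standard error of a proportion, using the n - 1 denominator described above.
data(mtcars)

n  <- nrow(mtcars)           # overall n (32 cars)
p  <- mean(mtcars$am == 0)   # proportion with am == 0 (19 / 32)
se <- sqrt(p * (1 - p) / (n - 1))

se
```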

For one-way tables, the default 95 percent confidence intervals are logit transformed confidence intervals equivalent to those used by Stata. Alternatively, freq_table returns Wald ("linear") confidence intervals when ci_type = "wald".

For two-way tables, freq_table returns logit transformed confidence intervals equivalent to those used by Stata.
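The two interval types can be sketched in base R. The sketch assumes the logit interval has the form used by Stata's proportion command, logit(p) +/- t * se / (p * (1 - p)), back-transformed to the probability scale. It is not the package's source code, and all variable names are illustrative.

```r
data(mtcars)

n      <- nrow(mtcars)
p      <- mean(mtcars$am == 0)            # 19 / 32
se     <- sqrt(p * (1 - p) / (n - 1))     # standard error as described above
t_crit <- qt(0.975, df = n - 1)           # critical value for t_prob = 0.975

# Logit-transformed interval (assumed Stata-style formula):
# build the interval on the log-odds scale, then map back with plogis().
logit_p    <- log(p / (1 - p))
half_width <- t_crit * se / (p * (1 - p))
logit_ci   <- plogis(logit_p + c(-1, 1) * half_width)

# Wald ("linear") interval: symmetric around p on the probability scale.
wald_ci <- p + c(-1, 1) * t_crit * se

round(100 * logit_ci, 2)  # 40.94 75.50, the am == 0 row in the Examples section
round(100 * wald_ci, 2)
```

Because the logit interval is back-transformed from the log-odds scale, its limits always fall between 0 and 100 percent, whereas the Wald limits can fall outside that range when the proportion is near 0 or 1.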

Usage

freq_table(x, t_prob = 0.975, ci_type = "logit", output = "default",
  digits = 2, ...)

Arguments

x

A grouped tibble, i.e., class == "grouped_df".

For two-way tables, the count for each level of the variable in the first argument to group_by will be the denominator for row percentages and their 95 percent confidence intervals. If the goal of the analysis is to compare percentages of some characteristic across two or more groups of interest, then the variable in the first argument to group_by should contain the groups of interest, and the variable in the second argument to group_by should contain the characteristic of interest.
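The role of the first grouping variable as the row denominator can be illustrated in base R, without freq_table itself. Here am plays the part of the first argument to group_by and cyl the second; the object names are illustrative.

```r
data(mtcars)

# Cross-tabulate the grouping variable (rows) by the characteristic (columns).
counts <- table(am = mtcars$am, cyl = mtcars$cyl)

# Row denominators: the number of cars at each level of am.
rowSums(counts)

# Row percentages: each count divided by its row total.
row_pct <- 100 * prop.table(counts, margin = 1)
round(row_pct, 2)
```

For am == 0 the denominator is 19, so the 3 four-cylinder cars give 3 / 19 = 15.79 percent, which matches percent_row in the two-way example.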

t_prob

(1 - alpha / 2). Default value is 0.975, which corresponds to an alpha of 0.05. Used to calculate a critical value from Student's t distribution with n - 1 degrees of freedom.
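A minimal base R sketch of how t_prob maps to a critical value, assuming n = 32 as in the mtcars examples. The variable names are illustrative.

```r
n      <- 32                  # sample size (assumed, matches mtcars)
alpha  <- 0.05
t_prob <- 1 - alpha / 2       # 0.975, the default

# Critical value from Student's t distribution with n - 1 degrees of freedom.
t_crit <- qt(t_prob, df = n - 1)

# A smaller alpha gives a larger critical value and therefore wider intervals.
t_crit_99 <- qt(1 - 0.01 / 2, df = n - 1)

c(t_crit, t_crit_99)
```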

ci_type

Selects the method used to estimate 95 percent confidence intervals. The default for one-way and two-way tables is logit transformed ("logit"). For one-way tables only, Wald ("linear") confidence intervals can be requested instead with ci_type = "wald".

output

Options for this parameter are "default" and "all".

For one-way tables with default output, the count, overall n, percent and 95 percent confidence interval are returned. Using output = "all" also returns the standard error of the percent and the critical t-value.

For two-way tables with default output, the count, group n, overall n, row percent, and 95 percent confidence interval for the row percent are returned. Using output = "all" also returns the overall percent, standard error of the percent, 95 percent confidence interval for the overall percent, the standard error of the row percent, and the critical t-values.

digits

Round percentages and confidence intervals to digits. Default is 2.

...

Other parameters to be passed on.

Value

A tibble with class "freq_table_one_way" or "freq_table_two_way"

References

Agresti, A. (2012). Categorical Data Analysis (3rd ed.). Hoboken, NJ: Wiley.

SAS documentation: https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_surveyfreq_a0000000221.htm

Stata documentation: https://www.stata.com/manuals13/rproportion.pdf

Examples

library(tidyverse)
library(bfuncs)

data(mtcars)

# One-way frequency table with defaults

mtcars %>%
  group_by(am) %>%
  freq_table()

#> # A tibble: 2 x 7
#>     var   cat     n n_total percent   lcl   ucl
#>   <chr> <dbl> <int>   <int>   <dbl> <dbl> <dbl>
#> 1    am     0    19      32   59.38 40.94 75.50
#> 2    am     1    13      32   40.62 24.50 59.06

# Two-way frequency table with defaults

mtcars %>%
  group_by(am, cyl) %>%
  freq_table()

#> # A tibble: 6 x 10
#>   row_var row_cat col_var col_cat     n n_row n_total percent_row lcl_row ucl_row
#>     <chr>   <dbl>   <chr>   <dbl> <int> <int>   <int>       <dbl>   <dbl>   <dbl>
#> 1      am       0     cyl       4     3    19      32       15.79    4.78   41.20
#> 2      am       0     cyl       6     4    19      32       21.05    7.58   46.44
#> 3      am       0     cyl       8    12    19      32       63.16   38.76   82.28
#> 4      am       1     cyl       4     8    13      32       61.54   32.30   84.29
#> 5      am       1     cyl       6     3    13      32       23.08    6.91   54.82
#> 6      am       1     cyl       8     2    13      32       15.38    3.43   48.18

brad-cannell/my_functions documentation built on July 25, 2019, 4:29 p.m.