bivariate_compare: Create publication-style table across one categorical...
In describedata: Miscellaneous Descriptive Functions


Descriptive statistics for categorical variables as well as normally and non-normally distributed continuous variables, split across levels of a categorical variable. Depending on the variable type, an appropriate statistical test is used to assess differences across levels of the comparison variable.

bivariate_compare(df, compare, normal_vars = NULL,
  non_normal_vars = NULL, cat_vars = NULL, display_round = 2,
  p = TRUE, p_round = 4, include_na = FALSE, col_n = TRUE,
  cont_n = FALSE, all_cont_mean = FALSE, all_cont_median = FALSE,
  iqr = TRUE, fisher = FALSE, workspace = NULL, var_order = NULL,
  var_label_df = NULL)

`df`	A data.frame or tibble.
`compare`	Discrete variable. Separate statistics will be produced for each level, with statistical tests across levels. Must be quoted.
`normal_vars`	Character vector of normally distributed continuous variables that will be included in the descriptive table.
`non_normal_vars`	Character vector of non-normally distributed continuous variables that will be included in the descriptive table.
`cat_vars`	Character vector of categorical variables that will be included in the descriptive table.
`display_round`	Number of decimal places displayed values should be rounded to.
`p`	Logical. Should p-values be calculated and displayed? Default `TRUE`.
`p_round`	Number of decimal places p-values should be rounded to.
`include_na`	Logical. Should `NA` values be included in the table and accompanying statistical tests? Default `FALSE`.
`col_n`	Logical. Should the total number of observations be displayed for each column? Default `TRUE`.
`cont_n`	Logical. Display sample n for continuous variables in the table. Default `FALSE`.
`all_cont_mean`	Logical. Display mean (sd) for all continuous variables. Default `FALSE` results in mean (sd) for normally distributed variables and median (IQR) for non-normally distributed variables. Must be `FALSE` if `all_cont_median == TRUE`.
`all_cont_median`	Logical. Display median (IQR) for all continuous variables. Default `FALSE` results in mean (sd) for normally distributed variables and median (IQR) for non-normally distributed variables. Must be `FALSE` if `all_cont_mean == TRUE`.
`iqr`	Logical. If the median is displayed for a continuous variable, should the interquartile range be displayed as well (`TRUE`), or should the values for the 25th and 75th percentiles be displayed (`FALSE`)? Default `TRUE`.
`fisher`	Logical. Should Fisher's exact test be used for categorical variables? Default `FALSE`. Ignored if `p == FALSE`.
`workspace`	Numeric value giving the workspace to be used for Fisher's exact test. If `NULL` (the default), a workspace of `2e5` is used. Ignored if `fisher == FALSE`.
`var_order`	Character vector listing the variable names in the order results should be displayed. If `NULL`, the default, continuous variables are displayed first, followed by categorical variables.
`var_label_df`	A data.frame or tibble with columns "variable" and "label" that contains display labels for each variable specified in `normal_vars`, `non_normal_vars`, and `cat_vars`.
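As a sketch of how `var_label_df` and `var_order` might be supplied together: the label text and the object names `iris_labels` and `iris_order` below are illustrative, not taken from the package, and the call is only run if describedata is installed.

```r
# Display labels: one row per variable, in columns named "variable" and "label".
iris_labels <- data.frame(
  variable = c("Sepal.Length", "Sepal.Width"),
  label = c("Sepal length (cm)", "Sepal width (cm)"),  # illustrative label text
  stringsAsFactors = FALSE
)

# Desired display order, given as variable names.
iris_order <- c("Sepal.Width", "Sepal.Length")

# Only call the function when the package is available.
if (requireNamespace("describedata", quietly = TRUE)) {
  describedata::bivariate_compare(
    iris,
    compare = "Species",
    normal_vars = c("Sepal.Length", "Sepal.Width"),
    var_order = iris_order,
    var_label_df = iris_labels
  )
}
```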

Differences across levels of compare are assessed with aov() for normally distributed continuous variables, kruskal.test() for non-normally distributed continuous variables, and chisq.test() for categorical variables. Setting fisher = TRUE uses fisher.test() for categorical variables instead.
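The four tests named above can be run directly in base R, which may help when checking a p-value from the table by hand. This is a rough sketch only: how bivariate_compare() calls these functions internally is not shown here, so the formulas and the choice of variables are assumptions made for illustration.

```r
# Normally distributed continuous variable: one-way ANOVA via aov().
fit <- aov(Sepal.Length ~ Species, data = iris)
p_aov <- summary(fit)[[1]][["Pr(>F)"]][1]

# Non-normally distributed continuous variable: Kruskal-Wallis rank sum test.
p_kw <- kruskal.test(mpg ~ factor(cyl), data = mtcars)$p.value

# Categorical variable: chi-squared test on the cross-tabulation.
tab <- table(mtcars$cyl, mtcars$am)
# Small expected counts trigger an approximation warning here, hence suppressWarnings().
p_chisq <- suppressWarnings(chisq.test(tab))$p.value

# Alternative for categorical variables: Fisher's exact test.
p_fisher <- fisher.test(tab, workspace = 2e5)$p.value

# Collect the p-values, rounded to 4 decimal places (the default for p_round).
round(c(aov = p_aov, kruskal = p_kw, chisq = p_chisq, fisher = p_fisher), 4)
```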

A data.frame with columns label, overall, a column for each level of compare, and p.value. For normal_vars, mean (SD) is displayed; for non_normal_vars, median (IQR) is displayed; and for cat_vars, n (percent) is displayed. For p-values on continuous variables, a superscript 'a' denotes that the Kruskal-Wallis test was used.

bivariate_compare(iris, compare = "Species", normal_vars = c("Sepal.Length", "Sepal.Width"))

bivariate_compare(mtcars, compare = "cyl", non_normal_vars = "mpg")
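A further sketch combines all three variable types in one call and requests Fisher's exact test for the categorical variables. Treating wt as normally distributed and mpg as non-normally distributed is purely for illustration, and the helper vectors `normal_cols`, `non_normal_cols`, and `cat_cols` are hypothetical names. The call is only run if describedata is installed.

```r
# Illustrative assignment of mtcars columns to variable types.
normal_cols <- "wt"
non_normal_cols <- "mpg"
cat_cols <- c("am", "gear")

if (requireNamespace("describedata", quietly = TRUE)) {
  res <- describedata::bivariate_compare(
    mtcars,
    compare = "cyl",
    normal_vars = normal_cols,
    non_normal_vars = non_normal_cols,
    cat_vars = cat_cols,
    fisher = TRUE,   # use fisher.test() instead of chisq.test() for cat_vars
    p_round = 3
  )
  print(names(res))  # inspect the column names of the returned data.frame
  print(res)
}
```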