
In health science research, tables of baseline characteristics of patient cohorts (commonly called Table 1) are produced routinely. Most of the analysis-specific data cleaning has to happen before the table is built, but without a streamlined way to compute and organize the resulting summary values, considerable time can be spent pasting results together. The make_table_one function attempts to make calculating, manipulating, and organizing these summary values as streamlined as possible.

Overview

Pre-function data cleaning:

Using make_table_one:

The function make_table_one

Usage

make_table_one(df, grouping_var, num_vars = NULL, num_display = "PM",
  binary_cat_vars = NULL, multiple_cat_vars = NULL, cat_display = "CP", 
  subgroups_m = NULL, mean_vars_for_subgroups = NULL, 
  subgroups_c = NULL, count_vars_for_subgroups = NULL, 
  order_of_vars = NULL, 
  export_rtf = FALSE, rtf_filename = NULL,
  show_pval = TRUE, digits = 2)

Parameters

Pre-function data cleaning

Once you have imported your data, clean it appropriately:

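library(haven)  # read_sas()
library(dplyr)  # %>%, mutate(), if_else(), select()

baseline <- read_sas("my_sas_file.sas7bdat")

# Recode the raw variables, then keep only the columns needed for the table
my_cleaned_baseline <- baseline %>%
  mutate(var1 = if_else(ex_var1, 0, 1),
         var2 = if_else(ex_var2, 0, 1),
         var3 = if_else(ex_var3 == 1, 1, 0)) %>%
  select(grouping_var, some_vector_of_other_variables)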

Now make vectors of the names of your different types of variables. NOTE: If you plan to analyze a categorical variable by itself and also use it for subgroup analysis, copy the column under a new name (e.g., rename the copy to something like 'variable_sub') and use the copy as the subgroup. This lets the output of make_table_one be sorted as specified in 'order_of_vars'. If the subgroup does not have a unique name, it will be sorted directly below the categorical variable.

num_vars <- c('num_var1', 'num_var2', ...)
binary_cat_vars <- c('bin_var1', 'bin_var2', ...)
multiple_cat_vars <- c('mc_var1', 'mc_var2', ...)

subgroups_m <- c('subgroup_m_1', 'subgroup_m_2', ...)
mean_vars <- c('calculate_mean_of_this_for_subgroup_m_1', 'calculate_mean_of_this_for_subgroup_m_2', ...)

subgroups_c <- c('subgroup_c_1', 'subgroup_c_2', ...)
count_vars <- c('calculate_count_of_this_for_subgroup_c_1', 'calculate_count_of_this_for_subgroup_c_2', ...)
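
The note above about copying a categorical variable before using it as a subgroup can be sketched in base R as follows. The data frame and its column names (treatment, sex, age, sex_sub) are hypothetical and only illustrate the renaming step; they are not part of the package.

```r
# Toy data frame; every column name here is made up for illustration.
df <- data.frame(
  treatment = c("A", "B", "A", "B"),
  sex = c("F", "M", "M", "F"),
  age = c(61, 58, 70, 66)
)

# Keep `sex` for its own row in the table and add a uniquely named copy
# that can be listed as a subgroup without clashing with the original.
df$sex_sub <- df$sex

# The copy now exists alongside the original column.
names(df)
```

The copy ('sex_sub' here) would then go into the subgroup vector, while the original name stays in the categorical variable vector.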

Convert your categorical variables into factors (VERY IMPORTANT):

my_cleaned_baseline[binary_cat_vars] <- lapply(my_cleaned_baseline[binary_cat_vars], factor)
my_cleaned_baseline[multiple_cat_vars] <- lapply(my_cleaned_baseline[multiple_cat_vars], factor)
my_cleaned_baseline[subgroups_m] <- lapply(my_cleaned_baseline[subgroups_m], factor)
my_cleaned_baseline[subgroups_c] <- lapply(my_cleaned_baseline[subgroups_c], factor)
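
As a quick sanity check, the same lapply pattern can be tried on a small data frame to confirm that the listed columns really become factors. This is a minimal sketch using base R only; the data frame and the column names smoker, stage, and age are hypothetical.

```r
# Toy data with two categorical columns and one numeric column.
df <- data.frame(
  smoker = c(0, 1, 1, 0),
  stage = c("I", "II", "III", "I"),
  age = c(61, 58, 70, 66),
  stringsAsFactors = FALSE
)

cat_vars <- c("smoker", "stage")

# Convert only the listed columns to factors; other columns are untouched.
df[cat_vars] <- lapply(df[cat_vars], factor)

# Confirm that every categorical column is now a factor.
all_factors <- vapply(df[cat_vars], is.factor, logical(1))
stopifnot(all(all_factors))
```

Running the vapply check on your own vectors of categorical names before building the table catches columns that were missed.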

Create a vector containing the order of variables as you wish them to be displayed in your final table:

var_order <- c('var_for_first_row', 'var_for_second_row', ...)
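
Before building the table, it can help to check the ordering vector for misspelled names. The sketch below assumes that every entry of the ordering vector is meant to be a column name of the data frame, which this document does not state explicitly; the data and names (treatment, age, smoker, bmi) are hypothetical.

```r
# Toy data frame standing in for the cleaned baseline data.
df <- data.frame(
  treatment = c("A", "B", "A", "B"),
  age = c(61, 58, 70, 66),
  smoker = c(0, 1, 1, 0)
)

# Ordering vector with one name ("bmi") that is not a column of df.
var_order <- c("age", "smoker", "bmi")

# Names requested in the ordering that do not exist in the data.
missing_vars <- setdiff(var_order, names(df))

if (length(missing_vars) > 0) {
  message("Not found in data: ", paste(missing_vars, collapse = ", "))
}
```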

Calling make_table_one

Once your data are cleaned and the vectors of variable names are defined as above, pass them to make_table_one and save the result as an object for viewing within R.

table_1 <- make_table_one(df = my_cleaned_baseline, grouping_var = random_assignment_group, 
  num_vars = num_vars, num_display = 'PM', 
  binary_cat_vars = binary_cat_vars, multiple_cat_vars = multiple_cat_vars, cat_display = 'CP',
  subgroups_m = subgroups_m, mean_vars_for_subgroups = mean_vars, 
  subgroups_c = subgroups_c, count_vars_for_subgroups = count_vars,
  order_of_vars = var_order,
  export_rtf = TRUE, rtf_filename = 'my_location/my_filename.rtf', 
  show_pval = TRUE, digits = 2)

Because export_rtf = TRUE, an RTF file will be generated at the location given in rtf_filename, here 'my_location/my_filename.rtf'.

Future Work

In the future, I hope to add the following:

Thanks for using make_table_one!



jjwillard/wfbmcphsr documentation built on May 14, 2019, 5:01 a.m.