rc_pool: Pools columns for aggregated analysis of fields with like...
In chillywings/rctools: Tools for REDCap API and data manipulation

rc_pool

R Documentation

Pools columns for aggregated analysis of fields with like data

Description

For each variable root provided, all columns in record_data whose names contain that root are pooled into a single column, which is appended to the end of the dataframe. To see which columns were pooled, run attributes([YOUR_DATA_FRAME])$pooled_vars on the returned dataframe.

Additionally, exact (i.e. full-name) matching can be performed with the fields_list argument. Fields provided in this argument are looked up among all column names. If both arguments are provided, fields_list is applied first.
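The difference between root matching and exact matching can be illustrated in base R. This is a minimal sketch of the column-selection idea only, not the package's implementation; the data frame and its column names are hypothetical, and `grep()` and `intersect()` stand in for whatever rc_pool does internally.

```r
# Hypothetical wide export; column names are illustrative only.
record_data <- data.frame(
  record_id      = 1:3,
  visit_1_weight = c(70.1, NA, 82.4),
  visit_2_weight = c(NA, 65.3, NA),
  weight_units   = c(1, 1, 1),
  bl_height      = c(170, NA, 181),
  fu_ht          = c(NA, 162, NA)
)

# Root matching: every column whose name contains the pattern
# (the pattern is treated as a regular expression).
root_cols <- grep("weight", names(record_data), value = TRUE)

# A stricter regular expression avoids catching "weight_units" by accident.
strict_cols <- grep("^visit_[0-9]+_weight$", names(record_data), value = TRUE)

# Exact matching, using the list shape documented for fields_list:
# list(new_column = c("old", "column", "names")).
fields_list <- list(height = c("bl_height", "fu_ht"))
exact_cols <- lapply(fields_list, function(x) intersect(x, names(record_data)))

print(root_cols)
print(strict_cols)
print(exact_cols)
```

A broad root such as "weight" also picks up `weight_units` here, which is one reason to inspect the pooled_vars attribute after pooling, or to fall back on exact matching when names share no common root.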

Furthermore, if the columns selected for pooling contain more than one data point per row, the first data point will be used. In this case, pooling is likely inappropriate and the pooled columns should be reviewed. However, if pooling is still desired and all data points should be kept, set make_repeat = TRUE to convert the pooled variables into repeats.
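To review whether pooling is appropriate, it can help to count data points per row before pooling. The base R sketch below uses hypothetical data and shows both outcomes described above: keeping the first value per row, and keeping every value by stacking the columns. The long layout (`instance`, `weight`) is an assumption made for illustration; the exact structure produced by make_repeat is not specified here.

```r
# Hypothetical data: record 2 has a value in both columns on the same row.
record_data <- data.frame(
  record_id      = 1:3,
  visit_1_weight = c(70.1, 64.8, NA),
  visit_2_weight = c(NA, 65.3, 82.4)
)
pool_cols <- c("visit_1_weight", "visit_2_weight")

# Count data points per row across the columns to be pooled.
n_points <- rowSums(!is.na(record_data[pool_cols]))
conflict_ids <- record_data$record_id[n_points > 1]

# Option 1: keep the first non-missing value in each row
# (any later value on the same row is dropped).
first_value <- apply(record_data[pool_cols], 1, function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[[1]] else NA_real_
})

# Option 2: keep every value by stacking the columns,
# giving one row per data point (assumed layout).
long <- data.frame(
  record_id = rep(record_data$record_id, times = length(pool_cols)),
  instance  = rep(seq_along(pool_cols), each = nrow(record_data)),
  weight    = unlist(record_data[pool_cols], use.names = FALSE)
)
long <- long[!is.na(long$weight), ]
long <- long[order(long$record_id, long$instance), ]

print(conflict_ids)
print(first_value)
print(long)
```

Rows listed in `conflict_ids` are the ones where the first-value rule would discard data, so they are the rows worth reviewing.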

Usage

rc_pool(
  record_data,
  var_roots = NULL,
  fields_list = NULL,
  make_repeat = TRUE,
  id_field = getOption("redcap_bundle")$id_field
)

Arguments

`record_data`	Dataframe. Record data exported from REDCap. For the purposes of this function, only quantitative data will be kept.
`var_roots`	Character. Vector of strings to search for within the column names of `record_data`. For each variable root provided, all columns whose names contain the root will be pooled into a single column. Regular expressions may be used.
`fields_list`	List. A list in the format `list(new_column = c("old","column","names"))`. Unlike `var_roots`, the column names provided here will be matched exactly. If both `var_roots` and `fields_list` are provided, `fields_list` will be applied first.
`make_repeat`	Logical. Determines whether the pooled columns will be converted into repeat instruments. Default is `TRUE`. This option is useful when the columns to be pooled contain more than one data point on the same row. In the future, this will be applied automatically on an as-needed basis.
`id_field`	Character. Field name corresponding to the 'record_id' field.

Details

The intention of this function is to correct for inefficient REDCap project design in which the same measurement has been assigned to multiple variables. For example, if the variables "visit_1_weight" and "visit_2_weight" have been used to collect weight at different visits rather than re-using the same variable, they can be pooled into a single column using var_roots = "weight". This is often desirable for analysis.
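The weight example can be sketched end to end in base R. This is an illustration of the idea under stated assumptions, not the package's code: the export is hypothetical (one row per record and event), the original columns are kept, and the shape of the pooled_vars attribute (a named list) is a guess.

```r
# Hypothetical longitudinal export: each visit's weight sits in its own column.
record_data <- data.frame(
  record_id         = c(1, 1, 2, 2),
  redcap_event_name = c("visit_1_arm_1", "visit_2_arm_1",
                        "visit_1_arm_1", "visit_2_arm_1"),
  visit_1_weight    = c(70.1, NA, 64.8, NA),
  visit_2_weight    = c(NA, 71.0, NA, 65.3)
)

# Columns sharing the root "weight".
cols <- grep("weight", names(record_data), value = TRUE)

# Each row holds at most one value, so a row-wise coalesce loses nothing.
pooled <- record_data[[cols[1]]]
for (col in cols[-1]) {
  pooled <- ifelse(is.na(pooled), record_data[[col]], pooled)
}

# Append the pooled column and record which columns fed it
# (attribute structure assumed for illustration).
record_data$weight <- pooled
attr(record_data, "pooled_vars") <- list(weight = cols)

# The pooled column can now be analysed directly, e.g. mean weight per event.
print(aggregate(weight ~ redcap_event_name, data = record_data, FUN = mean))
```

With the package installed, the corresponding call would presumably be `rc_pool(record_data, var_roots = "weight", id_field = "record_id")`, following the signature shown under Usage.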