View source: R/e_split_list_columns_into_indicator_columns.R
e_split_list_columns_into_indicator_columns | R Documentation |
Commonly used for comorbidity or prescription lists stored in one or more columns. Takes a column where items are separated by punctuation (,.;/|) and creates separate columns with indicators. Counts of items >1 can be treated as 1 to simplify summary tables (for example, multiple items coded as "other").
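The splitting idea described above can be sketched in base R. This is a minimal illustration under assumptions, not the function's actual implementation: the input vector `x` is made up, matching is case-sensitive here, and the function itself may normalize items differently.

```r
# Hypothetical input: one list column with mixed delimiters
x <- c("A, B ,C", "b", "D. c ; d", NA, "")

# Split on any single delimiter character: , . ; / |
items <- strsplit(x, split = "[,.;/|]")

# Trim whitespace, then drop missing values and empty strings
items <- lapply(items, function(v) {
  v <- trimws(v)
  v[!is.na(v) & nzchar(v)]
})

# Unique items across all rows (assumption: case-sensitive)
lev <- sort(unique(unlist(items)))

# One count column per item, one row per input element
ind <- t(vapply(
  items,
  function(v) as.integer(table(factor(v, levels = lev))),
  integer(length(lev))
))
colnames(ind) <- paste0("item_", lev)
ind
```

Rows that are `NA` or empty produce all zeros, and each remaining row has a count in the column of every item it contains.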
e_split_list_columns_into_indicator_columns(
  dat_this,
  var_names_items = NULL,
  item_delimiters = ",.;/|",
  code_other_below_freq = 5,
  label_other = "other",
  indicator_col_prefix = "item_",
  sw_data_or_summary = c("data", "summary")[1],
  sw_replace_GT1_with_1 = FALSE,
  sw_print_unique = TRUE
)
dat_this |
entire data.frame or tibble, will return with additional indicator columns |
var_names_items |
column names with lists of items |
item_delimiters |
delimiter(s) that separate items within a single column |
code_other_below_freq |
replace item name with label_other if its total frequency is below this value |
label_other |
label for the "other" category |
indicator_col_prefix |
prefix for the new indicator columns |
sw_data_or_summary |
return data with indicator columns or return summary tables of frequencies of items |
sw_replace_GT1_with_1 |
T/F, to replace "greater than 1" counts with an indicator of 1 (to interpret as "at least 1") |
sw_print_unique |
T/F, print list of items before and after replacing with "other" |
Depending on sw_data_or_summary, either dat_this with the additional indicator columns, or a list of summary tables of item frequencies.
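The roles of code_other_below_freq, label_other, and sw_replace_GT1_with_1 can be illustrated with a small base R sketch. It is an assumption-laden illustration rather than the function's own code: the `items` list is invented, and "below" is taken to mean strictly less than the threshold.

```r
# Hypothetical item lists per row, already split and trimmed
items <- list(c("a", "b"), c("a", "x"), c("b", "y", "z"), "a")

code_other_below_freq <- 2
label_other <- "other"

# Total frequency of each item across all rows
freq <- table(unlist(items))

# Items occurring fewer than the threshold (assumption: strict "<")
rare <- names(freq)[freq < code_other_below_freq]

# Recode rare items to the "other" label
items_lumped <- lapply(
  items,
  function(v) ifelse(v %in% rare, label_other, v)
)

# Count of "other" per row; the third row holds two rare items
n_other <- vapply(items_lumped, function(v) sum(v == label_other), integer(1))

# Collapse counts greater than 1 to 1, read as "at least 1"
n_other_01 <- pmin(n_other, 1L)
```

Here `n_other` is 0, 1, 2, 0 and `n_other_01` is 0, 1, 1, 0, which is the kind of simplification that sw_replace_GT1_with_1 = TRUE is described as providing.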
dat_ex <-
  dplyr::tibble(
    col1 =
      c(
        NA, "", "a", "A, B ,C", "b", "D. c ; d"
      , "x ;Y", "ab/0|J;1;1", 2, "other", 1
      , "a a a", "1,a a a", "2,a a a"
      )
  , col2 = LETTERS[1:length(col1)]
  ) |>
  dplyr::mutate(
    ID = 1:dplyr::n()
  ) |>
  dplyr::select(
    ID
  , tidyselect::everything()
  )
dat_ex |> print(n = Inf)
# return data
dat_ex_out <-
  e_split_list_columns_into_indicator_columns(
    dat_this = dat_ex
  , var_names_items = c("col1", "col2")
  , item_delimiters = ",.;/|"
  , code_other_below_freq = 2
  , label_other = "other"
  , indicator_col_prefix = "item_"
  , sw_data_or_summary = "data"
  , sw_replace_GT1_with_1 = FALSE
  , sw_print_unique = TRUE
  )
dat_ex_out |> print(n = Inf)
# return summary
dat_ex_sum <-
  e_split_list_columns_into_indicator_columns(
    dat_this = dat_ex
  , var_names_items = c("col1", "col2")
  , item_delimiters = ",.;/|"
  , code_other_below_freq = 2
  , label_other = "other"
  , indicator_col_prefix = "item_"
  , sw_data_or_summary = "summary"
  , sw_replace_GT1_with_1 = FALSE
  , sw_print_unique = FALSE
  )
dat_ex_sum