e_split_list_columns_into_indicator_columns: Split a (set of) item-listing columns into indicator columns
In erikerhardt/erikmisc: Erik Erhardt's miscellaneous functions for solving complex data analysis workflows

View source: R/e_split_list_columns_into_indicator_columns.R

Description

Commonly used for comorbidity or prescription lists stored in one or more columns. Takes a column in which items are separated by punctuation (,.;/|) and creates a separate indicator column for each item. Counts of items greater than 1 can optionally be treated as 1 to simplify summary tables (for example, when multiple items are coded as "other").
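To make the idea concrete, the following is a minimal base R sketch of splitting a delimited column into per-item count columns. It is not the erikmisc implementation: the input vector `x_demo`, the whitespace trimming, and the treatment of `NA` and empty strings as "no items" are assumptions made for illustration only.

```r
# Illustrative sketch only; NOT the package's implementation.
# x_demo is a hypothetical item-listing column.
x_demo <- c("a, b", "b;c", NA, "a/a|c")

# Split on any one of the delimiter characters , . ; / |
items <- strsplit(x_demo, split = "[,.;/|]")

# Trim whitespace; drop NA and empty strings (assumed to mean "no items")
items <- lapply(items, function(v) {
  v <- trimws(v)
  v[!is.na(v) & nzchar(v)]
})

# All distinct items, sorted
lev <- sort(unique(unlist(items)))

# One column per item, holding how many times the item occurs in each row
counts <- t(vapply(
  items,
  function(v) tabulate(match(v, lev), nbins = length(lev)),
  integer(length(lev))
))
colnames(counts) <- paste0("item_", lev)
counts
```

In this sketch the last row lists "a" twice, so its `item_a` count is 2 rather than 1; this is the situation the `sw_replace_GT1_with_1` switch is described as addressing.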

Usage

e_split_list_columns_into_indicator_columns(
  dat_this,
  var_names_items = NULL,
  item_delimiters = ",.;/|",
  code_other_below_freq = 5,
  label_other = "other",
  indicator_col_prefix = "item_",
  sw_data_or_summary = c("data", "summary")[1],
  sw_replace_GT1_with_1 = FALSE,
  sw_print_unique = TRUE
)

Arguments

`dat_this`	data.frame or tibble; returned with additional indicator columns
`var_names_items`	names of the columns that contain lists of items
`item_delimiters`	delimiter character(s) that separate items within a single column
`code_other_below_freq`	replace an item name with `label_other` if the total frequency of that item is less than this value
`label_other`	label for the "other" category
`indicator_col_prefix`	prefix for the names of the new indicator columns
`sw_data_or_summary`	"data" to return the data with indicator columns, or "summary" to return summary tables of item frequencies
`sw_replace_GT1_with_1`	logical (TRUE/FALSE); replace counts greater than 1 with an indicator of 1 (interpreted as "at least 1")
`sw_print_unique`	logical (TRUE/FALSE); print the list of items before and after replacing with `label_other`
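As a rough illustration of how `code_other_below_freq`, `label_other`, and `sw_replace_GT1_with_1` are described as interacting, here is a small base R sketch. It is written from the argument descriptions above, not from the package source; the list `items_demo` and the intermediate object names are hypothetical.

```r
# Illustrative sketch only; NOT the package's implementation.
# items_demo is a hypothetical list of already-split items, one element per row.
items_demo <- list(c("a", "b"), c("a", "b"), c("a", "x", "y"))

code_other_below_freq <- 2
label_other           <- "other"

# Total frequency of each item across all rows
freq <- table(unlist(items_demo))

# Items whose total frequency is below the threshold
rare <- names(freq)[freq < code_other_below_freq]

# Recode rare items to the "other" label
items_recode <- lapply(
  items_demo,
  function(v) ifelse(v %in% rare, label_other, v)
)

# Per-row counts of each (recoded) item
lev <- sort(unique(unlist(items_recode)))
counts <- t(vapply(
  items_recode,
  function(v) tabulate(match(v, lev), nbins = length(lev)),
  integer(length(lev))
))
colnames(counts) <- lev

# Cap counts at 1 so each column reads as "at least one"
indicators <- pmin(counts, 1L)
indicators
```

Here "x" and "y" each occur once, so both are recoded to "other"; the third row then has an "other" count of 2, which the final step reduces to 1.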

Value

Depending on `sw_data_or_summary`, either `dat_this` with additional indicator columns ("data"), or a list of summary tables of item frequencies ("summary").

Examples

# example data: col1 mixes delimiters, spacing, case, NA, and empty strings;
#   the numeric values 2 and 1 are coerced to character by c()
dat_ex <-
  dplyr::tibble(
    col1 =
      c(
        NA, "", "a", "A, B  ,C", "b", "D. c ;    d"
      , "x  ;Y", "ab/0|J;1;1", 2, "other", 1
      , "a a a", "1,a a a", "2,a a a"
      )
  , col2 = LETTERS[seq_along(col1)]
  ) |>
  dplyr::mutate(
    ID = seq_len(dplyr::n())
  ) |>
  dplyr::select(
    ID
  , tidyselect::everything()
  )
dat_ex |> print(n = Inf)

# return data
dat_ex_out <-
  e_split_list_columns_into_indicator_columns(
    dat_this              = dat_ex
  , var_names_items       = c("col1", "col2")
  , item_delimiters       = ",.;/|"
  , code_other_below_freq = 2
  , label_other           = "other"
  , indicator_col_prefix  = "item_"
  , sw_data_or_summary    = "data"
  , sw_replace_GT1_with_1 = FALSE
  , sw_print_unique       = TRUE
  )
dat_ex_out |> print(n = Inf)

# return summary
dat_ex_sum <-
  e_split_list_columns_into_indicator_columns(
    dat_this              = dat_ex
  , var_names_items       = c("col1", "col2")
  , item_delimiters       = ",.;/|"
  , code_other_below_freq = 2
  , label_other           = "other"
  , indicator_col_prefix  = "item_"
  , sw_data_or_summary    = "summary"
  , sw_replace_GT1_with_1 = FALSE
  , sw_print_unique       = FALSE
  )
dat_ex_sum

erikerhardt/erikmisc documentation built on April 17, 2025, 10:48 a.m.
