categorize | R Documentation |
This is the main function of the package, which relies on a triad of objects:
(1) data with unit ids and possible dates of interest;
(2) codedata for corresponding units, with optional dates of interest; and
(3) a classification scheme (classcodes object; cc) with regular expressions to identify and categorize relevant codes.
The function combines the three underlying steps performed by codify(), classify() and index().
Relevant arguments are passed to those functions by codify_args and cc_args.
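To make the three steps concrete, here is a minimal base R sketch of the idea. It is not the package's implementation: the toy data, the two-group scheme and every name in it are hypothetical, and the "index" is simply an unweighted count of groups.

```r
# Toy inputs; all names and values are hypothetical
people <- data.frame(id = c("a", "b", "c"), stringsAsFactors = FALSE)
codes <- data.frame(
  id   = c("a", "a", "b"),
  code = c("I21", "E10", "E11"),
  stringsAsFactors = FALSE
)

# A miniature classification scheme: one regular expression per group
scheme <- c(heart = "^I2[12]", diabetes = "^E1[0-4]")

# Step 1 (compare codify): link codes to units, keeping units without codes
linked <- merge(people, codes, by = "id", all.x = TRUE)

# Step 2 (compare classify): does any code of the unit match the group?
member <- sapply(scheme, function(rx) {
  people$id %in% linked$id[grepl(rx, linked$code)]
})

# Step 3 (compare index): here just the number of groups per unit
idx <- rowSums(member)

# One row per unit: logical group columns plus a numeric index
result <- cbind(people, member, index = idx)
print(result)
```

Unit "c" has no codes at all and therefore ends up with FALSE in every group and an index of zero, which is why the merge keeps unmatched units.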
categorize(x, ...)

## S3 method for class 'data.frame'
categorize(x, ...)

## S3 method for class 'tbl_df'
categorize(x, ...)

## S3 method for class 'data.table'
categorize(x, ..., codedata, id, code, codify_args = list())

## S3 method for class 'codified'
categorize(
  x,
  ...,
  cc,
  index = NULL,
  cc_args = list(),
  check.names = TRUE,
  .data_cols = NULL
)
x | data set with mandatory character id column (identified by argument |
... | arguments passed between methods |
codedata | external code data with mandatory character id column (identified by |
id | name of unique character id column found in both |
code | name of code column in |
codify_args | Lists of named arguments passed to |
cc | |
index | Argument passed to |
cc_args | List with named arguments passed to |
check.names | Column names are based on |
.data_cols | used internally |
Object of the same class as x with additional logical columns indicating membership of groups identified by the classcodes object (the cc argument).
Numeric indices are also included if requested by the index argument.
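Comorbidity indices such as Charlson are commonly computed as a weighted sum over the group memberships. The base R sketch below shows that arithmetic on a hypothetical membership matrix with hypothetical weights; it is not the package's own code, and the real weights come from the classcodes object.

```r
# Hypothetical logical membership matrix: one row per unit, one column per group
member <- matrix(
  c(TRUE,  TRUE, FALSE,   # heart
    FALSE, TRUE, FALSE),  # diabetes
  nrow = 3,
  dimnames = list(c("a", "b", "c"), c("heart", "diabetes"))
)

# Hypothetical weights per group (not taken from any published index)
weights <- c(heart = 1, diabetes = 2)

# Weighted sum per unit; multiplying by 1 turns TRUE/FALSE into 1/0
idx <- drop((member * 1) %*% weights[colnames(member)])
print(idx)
```

Indexing the weights by colnames(member) keeps groups and weights aligned even if they are stored in different orders.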
Other verbs: classify(), codify(), index_fun
# For this example, 1 core would suffice:
old_threads <- data.table::getDTthreads()
data.table::setDTthreads(1)

# For some patient data (ex_people) and related hospital visit code data
# with ICD 10-codes (ex_icd10), add the Elixhauser comorbidity
# conditions based on all registered ICD10-codes
categorize(
  x = ex_people,
  codedata = ex_icd10,
  cc = "elixhauser",
  id = "name",
  code = "icd10"
)

# Add Charlson categories and two versions of a calculated index
# ("quan_original" and "quan_updated").
categorize(
  x = ex_people,
  codedata = ex_icd10,
  cc = "charlson",
  id = "name",
  code = "icd10",
  index = c("quan_original", "quan_updated")
)

# Only include recent hospital visits within 30 days before surgery.
categorize(
  x = ex_people,
  codedata = ex_icd10,
  cc = "charlson",
  id = "name",
  code = "icd10",
  index = c("quan_original", "quan_updated"),
  codify_args = list(
    date = "surgery",
    days = c(-30, -1),
    code_date = "admission"
  )
)

# Multiple versions -------------------------------------------------------

# We can compare categorization according to Quan et al. (2005); "icd10",
# and Armitage et al. (2010); "icd10_rcs" (see `?charlson`)
# Note the use of `tech_names = TRUE` to distinguish the column names from the
# two versions.

# We first specify some common settings ...
ind <- c("quan_original", "quan_updated")
cd <- list(date = "surgery", days = c(-30, -1), code_date = "admission")

# ... we then categorize once with "icd10" as the default regular expression ...
categorize(
  x = ex_people,
  codedata = ex_icd10,
  cc = "charlson",
  id = "name",
  code = "icd10",
  index = ind,
  codify_args = cd,
  cc_args = list(tech_names = TRUE)
) %>%
  # ... and once more with `regex = "icd10_rcs"`
  categorize(
    codedata = ex_icd10,
    cc = "charlson",
    id = "name",
    code = "icd10",
    index = ind,
    codify_args = cd,
    cc_args = list(regex = "icd10_rcs", tech_names = TRUE)
  )

# column names ------------------------------------------------------------

# Default column names are based on row names from corresponding classcodes
# object but are modified to be syntactically correct.
default <- categorize(
  ex_people,
  codedata = ex_icd10,
  cc = "elixhauser",
  id = "name",
  code = "icd10"
)

# Set `check.names = FALSE` to retain original names:
original <- categorize(
  ex_people,
  codedata = ex_icd10,
  cc = "elixhauser",
  id = "name",
  code = "icd10",
  check.names = FALSE
)

# Or use `tech_names = TRUE` for informative but long names (use case above)
tech <- categorize(
  ex_people,
  codedata = ex_icd10,
  cc = "elixhauser",
  id = "name",
  code = "icd10",
  cc_args = list(tech_names = TRUE)
)

# Compare
tibble::tibble(names(default), names(original), names(tech))

# Go back to original number of threads
data.table::setDTthreads(old_threads)
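The examples above restrict codes to a window before surgery with days = c(-30, -1). The base R sketch below shows one way to read such a window, assuming it is measured in whole days relative to the reference date and is inclusive at both ends. The dates are made up, and the exact boundary handling in the package is an assumption here.

```r
# Hypothetical reference date and code dates
surgery   <- as.Date("2020-03-01")
admission <- as.Date(c("2020-01-15", "2020-02-10", "2020-02-29", "2020-03-01"))

# Window in days relative to surgery: from 30 days before to 1 day before
days <- c(-30, -1)

# Negative offsets are admissions before surgery
offset <- as.numeric(difftime(admission, surgery, units = "days"))

# Keep only admissions inside the (assumed inclusive) window
keep <- offset >= days[1] & offset <= days[2]
print(data.frame(admission, offset, keep))
```

With an upper bound of -1, an admission on the day of surgery itself falls outside the window.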