icd9Comorbid: find comorbidities from ICD-9 codes.
In icd9: Tools for Working with ICD-9 Codes, and Finding Comorbidities


RcppParallel approach to comorbidity assignment with OpenMP and vector of integers strategy. It is very fast, and most time is now spent setting up the data to be passed in.

This is the main function which extracts comorbidities from a set of ICD-9 codes. The derivative functions (e.g. icd9ComorbidAhrq) additionally do some trivial post-processing of the comorbidity data, e.g. renaming to human-friendly field names, and updating fields according to rules. The exact fields from the original mappings can be obtained using applyHierarchy = FALSE, but for comorbidity counting, Charlson Score, etc., the rules should be applied.

For Charlson/Deyo comorbidities, strictly speaking, the original mapping does not drop e.g. uncomplicated DM when complicated DM exists. However, dropping it is probably useful in general, and is essential when calculating the Charlson score.
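The hierarchy rule described above can be sketched in base R. This is only an illustration, not the package's implementation: the column names `DM` and `DMcx` and the helper `apply_dm_hierarchy` are assumptions made for this example.

```r
# Hypothetical comorbidity flags: one row per visit, one column per comorbidity.
# The column names "DM" and "DMcx" are assumptions made for this sketch.
comorbid <- matrix(
  c(TRUE,  TRUE,
    TRUE,  FALSE,
    FALSE, TRUE),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("v1", "v2", "v3"), c("DM", "DMcx"))
)

# Hierarchy rule: where complicated diabetes is flagged, clear the
# uncomplicated flag so the condition is only counted once.
apply_dm_hierarchy <- function(x) {
  x[x[, "DMcx"], "DM"] <- FALSE
  x
}

squashed <- apply_dm_hierarchy(comorbid)
```

Visit `v1` starts with both flags set and keeps only `DMcx` after the rule is applied, which is what a score that weights the two conditions differently needs.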

icd9ComorbidShortCpp(icd9df, icd9Mapping, visitId, icd9Field, threads = 8L,
  chunkSize = 256L, ompChunkSize = 1L, aggregate = TRUE)

icd9Comorbid(icd9df, icd9Mapping, visitId = NULL, icd9Field = NULL,
  isShort = icd9GuessIsShort(icd9df[1:100, icd9Field]),
  isShortMapping = icd9GuessIsShort(icd9Mapping), return.df = FALSE, ...)

icd9ComorbidShort(...)

icd9ComorbidAhrq(..., abbrevNames = TRUE, applyHierarchy = TRUE)

icd9ComorbidQuanDeyo(..., abbrevNames = TRUE, applyHierarchy = TRUE)

icd9ComorbidQuanElix(..., abbrevNames = TRUE, applyHierarchy = TRUE)

icd9ComorbidElix(..., abbrevNames = TRUE, applyHierarchy = TRUE)

icd9Comorbidities(...)

icd9ComorbiditiesAhrq(...)

icd9ComorbiditiesElixHauser(...)

icd9ComorbiditiesQuanDeyo(...)

icd9ComorbiditiesQuanElixhauser(...)

`icd9df`	data frame containing columns for visitId (which is the default name), icd9 (default for the ICD-9 code), and maybe also a POA flag.
`icd9Mapping`	list (or name of a list if character vector of length one is given as argument) of the comorbidities with each top-level list item containing a vector of decimal ICD-9 codes. This is in the form of a list, with the names of the items corresponding to the comorbidities (e.g. "HTN", or "diabetes") and the contents of each list item being a character vector of short-form (no decimal place but ideally zero left-padded) ICD-9 codes. There is no default: users should prefer the derivative functions, e.g. icd9ComorbidAhrq, since these also provide appropriate naming for the fields, and squash the hierarchy (see `applyHierarchy` below).
`visitId`	The name of the column in the data frame which contains the patient or visit identifier. Typically this is the visit identifier, since patients leave and re-enter hospital with different ICD-9 codes. It is a character vector of length one. If left empty, or `NULL`, then an attempt is made to guess which field has the ID for the patient encounter (not a patient ID, although this can of course be specified directly). The guesses proceed until a single match is made. Data frames may be wide with many matching fields, so to avoid false positives, anything but a single match is rejected. If there are no successful guesses, and `visitId` was not specified, then the first column of the data frame is used.
`icd9Field`	The column in the data frame which contains the ICD codes. This is a character vector of length one. If it is `NULL`, `icd9` will attempt to guess the column name, looking for progressively less likely possibilities until it matches a single column. Failing this, it will take the first column in the data frame. Specifying the column using this argument avoids the guesswork.
`aggregate`	single logical value. If `TRUE`, then take (possibly much) more time to aggregate out-of-sequence visit IDs in the icd9df data.frame. If this is `FALSE`, then each contiguous group of visit IDs will result in a row of comorbidities in the output data. If you know your visitIds are possibly disordered, then use `TRUE`.
`isShort`	single logical value which determines whether the ICD-9 code provided is in short (TRUE) or decimal (FALSE) form. Where reasonable, this is guessed from the input data.
`isShortMapping`	Same as isShort, but applied to `icd9Mapping` instead of `icd9df`. All the codes in a mapping should be of the same type, i.e. short or decimal.
`...`	arguments passed to the corresponding function from the alias. E.g. all the arguments passed to `icd9ComorbiditiesAhrq` are passed on to `icd9ComorbidAhrq`
`abbrevNames`	single logical value that defaults to `TRUE`, in which case the shorter human-readable names stored in e.g. `ahrqComorbidNamesAbbrev` are applied to the data frame column names.
`applyHierarchy`	single logical value that defaults to `TRUE`, in which case the hierarchy defined for the mapping is applied. E.g. in Elixhauser, you can't have uncomplicated and complicated diabetes both flagged.
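As a rough illustration of the shape `icd9Mapping` is described as taking, the base-R sketch below builds a small named list of short-form codes and looks each visit's codes up in it. The mapping `myMapping` and its codes are hypothetical and not a validated clinical mapping, and the lookup is a plain-R stand-in rather than the package's C++ code.

```r
# Hypothetical mapping in the shape described for `icd9Mapping`: a named list
# whose items are character vectors of short-form ICD-9 codes.
# The codes are illustrative only, not a validated clinical mapping.
myMapping <- list(
  HTN = c("4010", "4011", "4019"),
  CHF = c("4280", "4281", "39891")
)

pts <- data.frame(
  visitId = c("2", "1", "2", "3", "3"),
  icd9 = c("39891", "4011", "4280", "4019", "39891"),
  stringsAsFactors = FALSE
)

# Base-R illustration of the lookup: for each visit, is any of its codes
# listed under each comorbidity? Rows follow the sorted visit IDs.
flags <- sapply(myMapping, function(codes) {
  tapply(pts$icd9 %in% codes, pts$visitId, any)
})
```

With the package loaded, the same inputs would presumably be passed as `icd9Comorbid(pts, myMapping, visitId = "visitId", icd9Field = "icd9")`, following the usage shown above.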

There is a change in behavior from previous versions. The visitId column is (implicitly) sorted by using std::set container. Previously, the visitId output order was whatever R's aggregate produced.
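The difference between contiguous grouping (`aggregate = FALSE`) and aggregated, sorted output can be pictured with base R alone. This sketch only mimics the grouping of visit IDs; it is not how the package computes the result.

```r
visitId <- c("2", "1", "2", "3", "3")

# Without aggregation, each contiguous run of an ID is its own group,
# so the out-of-sequence "2" shows up twice.
contiguous <- rle(visitId)$values

# With aggregation, rows sharing an ID are merged and the IDs come out
# sorted, similar in effect to inserting them into a std::set.
aggregated <- sort(unique(visitId))
```

Here `contiguous` has four groups and `aggregated` has three, so disordered input without aggregation would yield an extra output row for visit "2".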

The threading of the C++ code can be controlled using e.g. options(icd9.threads = 4). If it is not set, the number of cores in the machine is used.
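A minimal sketch of setting and reading the thread option follows. How the package itself reads the option and detects the core count is an assumption here; `getOption()` and `parallel::detectCores()` are used only to illustrate the fallback behaviour described above.

```r
# Set the option, keeping the previous value so it can be restored.
old <- options(icd9.threads = 4L)

# Assumed fallback: use the detected core count when the option is unset
# (detectCores() may return NA on some platforms).
threads <- getOption("icd9.threads", default = parallel::detectCores())

# Restore the previous setting.
options(old)
```

Saving and restoring the old value keeps the change local to one analysis script.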


library(icd9)

pts <- data.frame(visitId = c("2", "1", "2", "3", "3"),
                  icd9 = c("39891", "40110", "09322", "41514", "39891"))
# the output rows are now sorted by visitId, regardless of input order
icd9ComorbidShort(pts, ahrqComorbid)