icd9Count: count ICD codes or comorbidities for each patient
In jackwasey/icd9: Tools for Working with ICD-9 Codes, and Finding Comorbidities


icd9Count takes a data frame with a column for visitId and another for ICD-9 code, and returns the number of distinct codes for each patient.
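
As a rough illustration of what "number of distinct codes for each patient" means, the following base R sketch computes the same kind of count without the package. It is not the package's implementation; the data frame `visits` and the variable `counts` are made up for the example.

```r
# Hypothetical long-format data: one row per visit/code pair
visits <- data.frame(visitId = c("r", "r", "s"),
                     icd9 = c("441", "412.93", "044.9"),
                     stringsAsFactors = FALSE)

# Split the codes by visit, then count the distinct codes in each group
counts <- vapply(split(visits$icd9, visits$visitId),
                 function(codes) length(unique(codes)),
                 integer(1))
counts
```

The result is a named integer vector with one element per visitId.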

The visitId field is typically the first column. If there is no column called visitId and visitId is not specified, the first column is used.

icd9CountComorbidBin differs from the other counting functions in that it counts _comorbidities_, not individual diagnoses. It accepts any data frame with either logicals or zero/non-zero contents, with a single column for visitId. No checks are made to see whether visitId is duplicated.
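
Counting comorbidities from a flag table amounts to a row sum over the non-ID columns. The sketch below shows the idea in base R, assuming a made-up table `cmb` whose comorbidity column names (`MI`, `CHF`, `DM`) are purely illustrative; it is not the package's code.

```r
# Hypothetical comorbidity flags: logical or zero/non-zero, one row per visit
cmb <- data.frame(visitId = c("r", "s"),
                  MI  = c(TRUE, FALSE),
                  CHF = c(TRUE, FALSE),
                  DM  = c(0, 1),
                  stringsAsFactors = FALSE)

# Drop the ID column, treat any non-zero value as "present", and sum per row
flags <- cmb[, setdiff(names(cmb), "visitId"), drop = FALSE]
counts <- rowSums(flags != 0)
names(counts) <- as.character(cmb$visitId)
counts
```

As in the description above, nothing here checks whether a visitId appears more than once.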

For icd9Count, it is assumed that all the columns apart from visitId represent actual or possible ICD-9 codes. Duplicate visitIds are repeated as given and aggregated.
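
For wide data, the effect of the `aggregate` argument of icd9CountWide can be pictured with the base R sketch below. It assumes that a missing (`NA`) cell is not counted as a code, which is an assumption of this example rather than something stated here, and the names `per_row` and `per_visit` are hypothetical.

```r
# Hypothetical wide data with a repeated visitId
wide <- data.frame(visitId = c("r", "r", "s"),
                   icd9_1 = c("0011", "441", "456"),
                   icd9_2 = c(NA, "442", NA),
                   stringsAsFactors = FALSE)
codes <- wide[, setdiff(names(wide), "visitId"), drop = FALSE]

# Without aggregation: one count per input row, duplicates kept as given
per_row <- unname(rowSums(!is.na(codes)))

# With aggregation: rows sharing a visitId are summed, so the output is shorter
per_visit <- vapply(split(per_row, wide$visitId), sum, numeric(1))
per_visit
```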

icd9Count(x, visitId = NULL, return.df = FALSE)

icd9CountComorbidBin(x, visitId = NULL, return.df = FALSE)

icd9CountWide(x, visitId = NULL, return.df = FALSE, aggregate = FALSE)

`x`	data frame with one row per patient, and a true/false or 1/0 flag for each column. By default, the first column is the patient identifier and is not counted. If `visitId` is not specified, the first column is used.
`visitId`	The name of the column in the data frame which contains the patient or visit identifier. Typically this is the visit identifier, since patients leave and enter hospital with different ICD-9 codes. It is a character vector of length one. If left empty, or `NULL`, then an attempt is made to guess which field has the ID for the patient encounter (not a patient ID, although this can of course be specified directly). The guesses proceed until a single match is made. Data frames may be wide with many matching fields, so to avoid false positives, anything but a single match is rejected. If there are no successful guesses, and `visitId` was not specified, then the first column of the data frame is used.
`return.df`	single logical, if `TRUE`, return the result as a data frame with the first column being the `visitId`, and the second being the count. If `visitId` was a factor or named differently in the input, this is preserved.
`aggregate`	single logical, default is `FALSE`. If `TRUE`, the length (or rows) of the output will no longer match the input, but duplicate visitIds will be counted together.
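
The guessing rule described for `visitId` (accept a guess only when exactly one column matches, otherwise fall back to the first column) could look roughly like the sketch below. The helper `guess_id_column` and its patterns are hypothetical and are not the patterns the package actually tries.

```r
# Hypothetical helper: try each pattern in turn, accept only a unique match
guess_id_column <- function(x, patterns = c("^visit", "^encounter", "id$")) {
  for (p in patterns) {
    hits <- grep(p, names(x), ignore.case = TRUE, value = TRUE)
    if (length(hits) == 1L) {
      return(hits)          # exactly one matching column: accept the guess
    }
  }
  names(x)[1L]              # no unambiguous guess: use the first column
}

guess_id_column(data.frame(a = 1, visitId = 2, pid = 3))
guess_id_column(data.frame(code = 1, patient_id = 2, other_id = 3))
```

In the second call two columns end in "id", so that guess is rejected and the first column is used instead.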

A vector of the count of comorbidities for each patient. This is sometimes used as a metric of comorbidity load, instead of, or in addition to, metrics like the Charlson Comorbidity Index (also known as the Charlson Score).

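  library(icd9)
  # long format: one row per visit/code pair
  mydf <- data.frame(visitId = c("r", "r", "s"),
                   icd9 = c("441", "412.93", "044.9"))
  icd9Count(mydf, return.df = TRUE)  # data frame: visitId, then count
  icd9Count(mydf)                    # vector of counts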

  cmb <- icd9ComorbidQuanDeyo(mydf, isShort = FALSE, return.df = TRUE)
  icd9CountComorbidBin(cmb)

  wide <- data.frame(visitId = c("r", "s", "t"),
                   icd9_1 = c("0011", "441", "456"),
                   icd9_2 = c(NA, "442", NA),
                   icd9_3 = c(NA, NA, "510"))
  icd9CountWide(wide)
  # or:
  library(magrittr)
  wide %>% icd9WideToLong %>% icd9Count