Pediatric Complex Chronic Conditions

knitr::opts_chunk$set(collapse = TRUE, comment = "#>")


This vignette describes how the pccc package generates the Complex Chronic Condition Categories (CCC) from ICD-9 and ICD-10 codes.

A CCC is "any medical condition that can be reasonably expected to last at least 12 months (unless death intervenes) and to involve either several different organ systems or 1 organ system severely enough to require specialty pediatric care and probably some period of hospitalization in a tertiary care center." The categorization is based on the work of Feudtner et al. (2000, 2014), as referenced below.

A reference document listing the codes for each category was published as a supplement to Feudtner et al. (2014), and we have made it available as part of the pccc package. After installing the package, you can find the file on your system with the system.file call below. Open the file with any program that reads .docx files (Word, etc.).

system.file("pccc_references/Categories_of_CCCv2_and_Corresponding_ICD.docx", package = "pccc")

To evaluate the code chunks in this example, you will need to load the following R packages.

library(pccc)
library(dplyr)


Logic employed

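There are 12 categories of CCCs used in this package. The first 10 are mutually exclusive, so only one of them can be derived from a single ICD code: Neurologic and Neuromuscular, Cardiovascular, Respiratory, Renal and Urologic, Gastrointestinal, Hematologic or Immunologic, Metabolic, Other Congenital or Genetic Defect, Malignancy, and Premature and Neonatal.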

The last 2, Technology Dependency and Transplantation, can be selected in addition to one of the categories above. For example, one ICD code could generate CCC categorization as both Gastrointestinal and Technology Dependency.

To see actual specific ICD codes by category, see pccc-icd-codes.

Generating CCC categories from ICD codes

The ccc function is the workhorse here. Simply put, a user provides ICD codes as strings and ccc returns CCC categories. In the published code lists, ICD-9-CM codes for CCCs are matched on substrings and ICD-10 codes are matched on full codes. The ccc function nevertheless applies the same "starts with substring" matching logic to both, except in the few cases described in the next section.
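The "starts with" logic can be sketched in base R. This is an illustration only, not the package's internal implementation: the helper matches_prefix and the two-element prefix list are hypothetical, with 4251 and 3590 borrowed from the examples elsewhere in this vignette.

```r
# Hypothetical helper: TRUE when a code begins with any of the given prefixes.
matches_prefix <- function(codes, prefixes) {
  vapply(codes,
         function(code) any(startsWith(code, prefixes)),
         logical(1),
         USE.NAMES = FALSE)
}

# Illustrative prefixes only; the real lists are in pccc-icd-codes.
example_prefixes <- c("4251", "3590")

# "4251" and "42511" start with "4251"; "425.1" and "4250" do not.
prefix_hits <- matches_prefix(c("4251", "42511", "425.1", "4250"),
                              example_prefixes)
prefix_hits
```

Note that "425.1" does not match because the period breaks the prefix, which is why input formatting matters.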

Substring matching exceptions

Some datasets may contain ICD-9-CM codes at different levels of specificity, which can cause problems when certain codes are matched on substrings. For example, consider a patient with congenital hereditary muscular dystrophy. The least specific ICD-9-CM code for muscular dystrophy is 359, which is a CCC code. The exact ICD-9-CM code for congenital hereditary muscular dystrophy is 3590. Even when describing the same patient, one dataset may contain the 359 code while another contains the 3590 code.

If we use the substring matching logic described above and match on 359, we capture the patient in both datasets. However, we also capture non-CCC diagnoses such as 3594, toxic myopathy. If we instead match on 3590, we capture the patient only in the dataset with more specific ICD-9-CM codes.

We address this problem by exact matching for less specific codes (e.g., the code 359 will match only if the dataset contains the 3-digit code 359) and substring matching for more specific codes (e.g., code 3590 will match any code beginning with 3590). This approach improves the sensitivity of detecting CCCs in datasets with less specific codes (e.g., 359) and also reduces misclassification errors in datasets with more specific codes (e.g., 3590).

We have listed these exact match exceptions under their corresponding CCC category in the pccc-icd-codes description.
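A minimal sketch of how exact and "starts with" matching can be combined, using the muscular dystrophy codes from this section. The helper is_ccc_code and the two one-element code lists are hypothetical and are not taken from the package internals.

```r
# Hypothetical code lists for illustration only.
exact_codes  <- c("359")   # less specific: must equal the whole code
prefix_codes <- c("3590")  # more specific: further digits may follow

# Hypothetical helper: a code is a hit if it equals an exact-match code
# or begins with one of the prefix codes.
is_ccc_code <- function(codes) {
  exact_hit  <- codes %in% exact_codes
  prefix_hit <- vapply(codes,
                       function(code) any(startsWith(code, prefix_codes)),
                       logical(1),
                       USE.NAMES = FALSE)
  exact_hit | prefix_hit
}

exception_hits <- is_ccc_code(c("359", "3590", "3594"))
exception_hits
```

Under these assumptions, 359 and 3590 are both flagged, while 3594 (toxic myopathy) is not, because it neither equals 359 nor begins with 3590.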

Preparing ICD-9-CM and ICD-10-CM codes for analysis using the PCCC package

Users of the pccc package will need to pre-process the ICD-9 and ICD-10 codes in their data so that the strings are in a format the pccc package recognizes.
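As a minimal sketch of one such pre-processing step, the base R code below removes periods and surrounding whitespace from code strings. The helper strip_icd is hypothetical, and the sketch assumes that periods and stray whitespace are the only formatting problems; it does not implement the full set of formatting rules.

```r
# Hypothetical helper: drop surrounding whitespace and embedded periods.
# Assumes codes are otherwise complete and correctly ordered.
strip_icd <- function(codes) {
  codes <- trimws(as.character(codes))
  gsub(".", "", codes, fixed = TRUE)
}

cleaned <- strip_icd(c("425.1", " 37.51", "4251"))
cleaned
```

Codes that are already free of periods pass through unchanged, so the helper can be applied to a whole column.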

Specific rules to format ICD Codes correctly:

Potential issues with improperly formatted ICD codes:

Users of pccc may find the R package icd useful.

PCCC Examples

To illustrate how the input formatting affects the identification of a CCC, consider the data.frame named dat below. These data have information about three patients (A-C). Each patient has the same ICD-9-CM diagnosis code (Hypertrophic obstructive cardiomyopathy, ICD-9-CM 425.11, which should be sent as 4251) and the same ICD-9-CM procedure code (Heart transplantation, ICD-9-CM 37.51, which should be sent as 3751), but each input is formatted differently. Based on the ICD-9-CM diagnosis code, the ccc function will identify only patient A as having a CCC. Based on the ICD-9-CM procedure code, the ccc function will identify only patient B as having a CCC and will also flag the Transplantation category.

Basic Example

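# Three patients with the same diagnosis and procedure, formatted differently
dat <- data.frame(ids = c("A", "B", "C"),
                  dxs = c("4251", "425.1", "425.1"),
                  procs = c("37.51", "3751", "37.51"))
dat

# Identify CCC categories from the ICD-9-CM codes
ccc(dat,
    id = ids,
    dx_cols = dxs,
    pc_cols = procs,
    icdv = 9)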
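Before calling ccc, it can help to check which inputs contain characters that would prevent a match. The base R sketch below flags any value in the example data that contains something other than letters and digits. The helper has_separator is hypothetical, and the check is only a rough screen under that assumption.

```r
# Same toy data as in the basic example.
dat <- data.frame(ids   = c("A", "B", "C"),
                  dxs   = c("4251", "425.1", "425.1"),
                  procs = c("37.51", "3751", "37.51"),
                  stringsAsFactors = FALSE)

# Hypothetical check: TRUE when a code contains anything other than
# letters and digits (for example, a period).
has_separator <- function(codes) grepl("[^A-Za-z0-9]", codes)

flagged <- data.frame(ids       = dat$ids,
                      dx_flag   = has_separator(dat$dxs),
                      proc_flag = has_separator(dat$procs))
flagged
```

Only patient A has an unflagged diagnosis code and only patient B has an unflagged procedure code, which is consistent with the results described above.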

Extended Example

This example uses a tool developed by Seth Russell (available at icd_file_generator) to create sample data files for ICD-9-CM and ICD-10-CM. Each generated data file contains randomly generated ICD codes for 1,000 patients and comprises 10 columns of diagnosis codes (d_cols), 10 columns of procedure codes (p_cols), and 10 columns of other data (g_cols).

Sample of how the ICD-9-CM test file was generated:

# generate_sample() is provided by the icd_file_generator tool
pccc_icd9_dataset <- generate_sample(
  v = 9,
  n_rows = 10000,
  d_cols = 10,
  p_cols = 10,
  g_cols = 10
)

save(pccc_icd9_dataset, file = "pccc_icd9_dataset.rda")

Example using sample patient data set:


ccc_result <-
    ccc(pccc::pccc_icd9_dataset[, c(1:21)], # get id, dx, and pc columns
        id      = id,
        dx_cols = dplyr::starts_with("dx"),
        pc_cols = dplyr::starts_with("pc"),
        icdv    = 9)

# review results
head(ccc_result)

# view number of patients with each CCC
sum_results <- dplyr::summarize_at(ccc_result, dplyr::vars(-id), sum)
print.data.frame(sum_results)

# view proportion of total population with each CCC
print.data.frame(dplyr::summarize_at(ccc_result, dplyr::vars(-id), mean))
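The same two summaries can be sketched in base R without dplyr. The data frame toy_result below is a hypothetical stand-in for ccc output, assuming one row per patient and 0/1 indicator columns; its column names and values are made up for illustration.

```r
# Hypothetical stand-in for ccc() output: one row per patient, 0/1 flags.
toy_result <- data.frame(id         = c("A", "B", "C", "D"),
                         cvd        = c(1L, 0L, 1L, 0L),
                         transplant = c(0L, 1L, 0L, 0L),
                         ccc_flag   = c(1L, 1L, 1L, 0L),
                         stringsAsFactors = FALSE)

# Every column except the identifier is an indicator.
flag_cols <- setdiff(names(toy_result), "id")

# Number of patients with each flag, and the proportion of all patients.
counts      <- colSums(toy_result[flag_cols])
proportions <- colMeans(toy_result[flag_cols])

counts
proportions
```

Because the indicators are 0/1, the column sum is a patient count and the column mean is a proportion; multiply by 100 to express it as a percentage.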



pccc documentation built on July 1, 2020, 11:41 p.m.