Codelist diagnostics"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>", message = FALSE, warning = FALSE,
  fig.width = 7
)

library(CDMConnector)
if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") Sys.setenv("EUNOMIA_DATA_FOLDER" = tempdir())
if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER"))
if (!eunomiaIsAvailable()) downloadEunomiaData(datasetName = "synpuf-1k")

Introduction

In this example we're going to summarise the characteristics of individuals with an ankle sprain, ankle fracture, forearm fracture, or a hip fracture using the Eunomia synthetic data.

We'll begin by creating our study cohorts.

library(CDMConnector)
library(CohortConstructor)
library(CodelistGenerator)
library(PhenotypeR)
library(dplyr)
library(ggplot2)

con <- DBI::dbConnect(duckdb::duckdb(), 
                      CDMConnector::eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(con = con, 
                                cdmName = "Eunomia Synpuf",
                                cdmSchema   = "main",
                                writeSchema = "main", 
                                achillesSchema = "main")

cdm$injuries <- conceptCohort(cdm = cdm,
  conceptSet = list(
    "ankle_sprain" = 81151,
    "ankle_fracture" = 4059173,
    "forearm_fracture" = 4278672,
    "hip_fracture" = 4230399
  ),
  name = "injuries")
cdm$injuries |> 
  glimpse()

Summarising code use

To get a good understanding of the codes we've used to define our cohorts we can use the codelistDiagnostics() function.

code_diag <- codelistDiagnostics(cdm$injuries)

From our results we can see a table of counts of our codes in our database based on the achilles results.

tableAchillesCodeUse(code_diag)

And we can also see orphan codes, which are codes that we did not include in our cohort definition but maybe could have. These are codes being used in the database that are associated with codes we included in our definition. So although many can be false positives, we may identify some codes that we may want to our cohort definitions.

tableOrphanCodes(code_diag)

addCodelistAttribute

Some cohorts that may be created manually may not have the codelists recorded in the cohort_codelist attribute. The package has a utility function to record a codelist in a cohort_table object:

cohortCodelist(cdm$injuries, cohortId = 1)
cdm$injuries <- cdm$injuries |>
  addCodelistAttribute(codelist = list(new_codelist = c(1L, 2L)), cohortName = "ankle_fracture")
cohortCodelist(cdm$injuries, cohortId = 1)


Try the PhenotypeR package in your browser

Any scripts or data that you put into this service are public.

PhenotypeR documentation built on April 3, 2025, 10:46 p.m.