In meerapatelmd/OncoRegimenFinder: Derive Oncology Treatment Regimens from the OMOP CDM

knitr::opts_chunk$set(
  tidy = 'styler',
  collapse = TRUE,
  comment = "#>",
  cache = TRUE
)

Overview

The goal of OncoRegimenFinder is to write tables that derive antineoplastic drug combinations from every patient in the OMOP CDM instance. The execution occurs in 2 overarching steps:
1. Subsetting the Drug Exposure table for antineoplastic drugs
2. Executing an algorithm that derives antineoplastic drug combinations based on overlaps in the window of time created by adding and subtracting the date_lag_input in days from the start date of the drug exposure. This initial resultset is then joined onto itself for a number of iterations determined by regimen_repeats, the maximum number of drugs in derived drug combinations.

It is important to note that the procedures shown below will run an algorithm with a set date_lag_input and regimen_repeats across the entire OMOP CDM Drug Exposure Table. Though the ultimate goal is to enable retrieval of derived drug combinations from the final writeDatabaseSchema.regimenIngredientTable via a join on the person_id field, specific cohorts may require different date_lag_input and regimen_repeats. For example, esophageal carcinoma patients tend to have regimens executed in blocks of 2 weeks instead of 30 days and the algorithm may return more accurate results with a date_lag_input of 15 days. In special cases such as this it, a second run of OncoRegimenFinder with modified parameters should be run with customized table names.

General Procedure Notes

If any of the tables written in this package already exist in the given schema, the existing table is renamed with the system date appended in the format {table}_YYYY_mm_dd. If {table}_YYYY_mm_dd already exists, the process is skipped and the main table (ie {table}) is dropped.

Setup

Setup involves writing an Ingredient Exposure Table and a Vocabulary Table that serve as the data sources for the remainder of the OncoRegimenFinder procedure.

Setup Requirements

A database connection object conn to your OMOP CDM instance
writeDatabaseSchema destination schema with write privileges

Outputs

The objective of the setup is to map all data sources to RxNorm Ingredients, which will serve as the point of alignment between Drug Exposures and a Regimen. This involves creating the following tables in writeDatabaseSchema:
1. Ingredient Exposure Table (drugExposureIngredientTable): Drug Exposure Table with an additional ingredient_concept_id field that references the RxNorm Ingredient that corresponds to the drug_concept_id in the OMOP Vocabulary.
1. Vocabulary Table (vocabularyTable): In a similar fashion as the Ingredient Exposure Table, HemOnc Regimens are mapped to mapped to 1 or more RxNorm Ingredients.

Maintenance

The Ingredient Exposure Table should be rewritten according to the cadence in which the Drug Exposures Table source is updated.
The Vocabulary Table should be rewritten whenever there is an update to HemOnc, RxNorm, or RxNorm Extension in the OMOP Vocabulary.

Procedure

Load Packages

library(OncoRegimenFinder)
library(tidyverse)

library(fantasia)
conn <- fantasia::connectOMOP()

Set Parameters

cdmDatabaseSchema: schema housing the Concept and Drug Exposure Table
writeDatabaseSchema: schema to write the Ingredient Exposure and Vocabulary Table to
drugExposureIngredientTable: name of the Ingredient Exposure Table
vocabularyTable: name of the Vocabulary Table

# Schema for the Concept and Drug Exposure Tables
cdmDatabaseSchema <- "omop_cdm_2"

# Output Schema and Tables
writeDatabaseSchema <- "oncoregimenfinder"
drugExposureIngredientTable <- "ingredient_exposure"
vocabularyTable <- "vocabulary"

Vocabulary Table

Write

OncoRegimenFinder::createVocabTable(conn = conn,
                 writeDatabaseSchema = writeDatabaseSchema,
                 cdmDatabaseSchema = cdmDatabaseSchema,
                 vocabularyTable = vocabularyTable)

Row Count History

grep(pattern = paste0("^",vocabularyTable),
    pg13::lsTables(conn = conn,
                   schema = writeDatabaseSchema),
    ignore.case = T,
    value = TRUE) %>% 
        rubix::map_names_set(function(x) pg13::query(conn = conn,
                                          pg13::renderRowCount(schema = writeDatabaseSchema,
                                                   tableName = x))) %>% 
        dplyr::bind_rows(.id = "Table") %>% 
        dplyr::rename(RowCount = count)

At a Glance

vocabularyTableData <-
  pg13::query(conn = conn,
              sql_statement = pg13::buildQuery(schema = writeDatabaseSchema,
                                               tableName = vocabularyTable,
                                               n = 20,
                                               n_type = "random"))

vocabularyTableData

Ingredient Exposure Table

Write Ingredient Exposure Table

OncoRegimenFinder::buildIngredientExposuresTable(conn = conn,
                                                 cdmDatabaseSchema = cdmDatabaseSchema,
                                                 writeDatabaseSchema = writeDatabaseSchema,
                                                 drugExposureIngredientTable = drugExposureIngredientTable)

Row Count History

grep(pattern = paste0("^",drugExposureIngredientTable),
    pg13::lsTables(conn = conn,
                   schema = writeDatabaseSchema),
    ignore.case = T,
    value = TRUE) %>% 
        rubix::map_names_set(function(x) pg13::query(conn = conn,
                                          pg13::renderRowCount(schema = writeDatabaseSchema,
                                                   tableName = x))) %>% 
        dplyr::bind_rows(.id = "Table") %>% 
        dplyr::rename(RowCount = count)

At a Glance

drugExposureIngredientTableData <-
  pg13::query(conn = conn,
              sql_statement = pg13::buildQuery(schema = writeDatabaseSchema,
                                               tableName = drugExposureIngredientTable,
                                               n = 20,
                                               n_type = "random"))

drugExposureIngredientTableData %>%
  dplyr::select(ingredient_concept_id,
                starts_with("drug_exposure"))

fantasia::dcOMOP(conn = conn)

Run Algorithm

The algorithm can be run once the tables are written in [Setup].

Requirements

A database connection object conn to your OMOP CDM instance
writeDatabaseSchema destination schema with write privileges
writeDatabaseSchema.vocabularyTable & writeDatabaseSchema.drugExposureIngredientTable Tables (see ["Setup"])

Outputs

Intermediate

Cohort Table cohortTable: Ingredient Exposure table filtered for all the RxNorm Ingredients that are descendants of a superclass comprised of one or more OMOP Vocabulary Concept Classes Concept Ids (drug_classification_id_input). By default, drug_classification_id_input is set to Concept Classes of OncoRegimenFinder::atc_antineoplastic_id and OncoRegimenFinder::hemonc_classes, but is modifiable according to user preference.
Regimen Staging Table regimenStagingTable: A copy of the Cohort Table. This table is written to serve as the data source for running OncoRegimenFinder against a cohort requiring non-standard date_lag_input and regimen_repeats parameters.

Final

Regimen Table regimenTable: Regimen Staging Table that is run through the OncoRegimenFinder algorithm according to the default date_lag_input and regimen_repeats values of 30 days and 5, respectively.
Regimen Ingredient Table regimenIngredientTable: Regimen Ingredient Table that maps the algorithm output in the Regimen Table to HemOnc Regimens in the vocabularyTable written in [Setup].

Procedure

Load Packages

library(OncoRegimenFinder)
library(tidyverse)

library(fantasia)
conn <- fantasia::connectOMOP()

Set Parameters

# Schema of Source Person and Drug Exposure Tables
cdmDatabaseSchema <- "omop_cdm_2"

# Output Schema and Tables
writeDatabaseSchema <- "oncoregimenfinder"
drugExposureIngredientTable <- "ingredient_exposure"
cohortTable <- "cohort"
regimenTable <- "regimen"
regimenStagingTable <- "regimen_staging"
vocabularyTable <- "vocabulary"
regimenIngredientTable <- "regimen_ingredients"

# OMOP Vocabulary Drug Classes to filter Drug Exposures for
drug_classification_id_input <- c(#ATC Antineoplastics (OncoRegimenFinder::atc_antineoplastics_id)
                                 21601387, 
                                 #HemOnc Classes (OncoRegimenFinder::hemonc_classes)
                                 35101847, 
                                 35807195, 
                                 35807466, 
                                 35807470, 
                                 35807489)
false_positive_id <- 
                #OncoRegimenFinder::falsepositives
                c(45775396, 1304850, 792993, 19010482, 19089602, 
                  19080458, 19104221, 19065450, 1510328, 42903942, 
                  1354698, 1551860, 19003472, 1356009, 40244464, 
                  1303425, 745466, 1389464, 19025194, 740910, 1710612, 
                  35606631, 19003999, 985708, 45775206, 1308432, 1522957, 
                  1760616, 950637, 1300978, 44816310, 1500211, 1506270, 
                  923645, 19061406, 40171288, 1388796, 40168303, 975125, 
                  1507705, 1511449, 1738521, 1548195, 1518254, 40222444, 
                  1713332, 1112807, 19014878, 19034726, 984232, 1525866, 
                  989482, 1550557, 1551099, 924120, 904351)

# Date difference when assessing for drug combinations in the Drug Exposures table
date_lag_input <- 30
regimen_repeats <- 5

Steps

Cohort, Regimen, and a copy of Regimen as "Regimen Staging" Tables are written.
Regimen Table undergoes further processing by combining overlapping instances of an ingredient start date and +/- the date lag input over the number of regimen repeats.
A final Regimen Ingredient Table joins the Vocabulary Table with the Regimen Table to map drug combinations derived from this algorithm back to the HemOnc ontology and other OMOP Concepts

Write Cohort and Regimen Staging Tables

OncoRegimenFinder::buildCohortRegimenTable(conn = conn,
                                           cdmDatabaseSchema = cdmDatabaseSchema,
                                           writeDatabaseSchema = writeDatabaseSchema,
                                           cohortTable = cohortTable,
                                           regimenTable = regimenTable,
                                           drug_classification_id_input = drug_classification_id_input,
                                           false_positive_id = false_positive_id)

Cohort Table

The Cohort Table selects for the Person Id, Drug Exposure Id with Start and End Dates for all Drug Concept Ids that are RxNorm Ingredients representing that Drug Exposure filtered for all descendants of the curated Drug Cl argument.

At a Glance

cohortTableData <-
  pg13::query(conn = conn,
              sql_statement = pg13::buildQuery(schema = writeDatabaseSchema,
                                               tableName = cohortTable,
                                               n = 20,
                                               n_type = "random"))

cohortTableData

Row Count History

grep(pattern = cohortTable,
    pg13::lsTables(conn = conn,
                   schema = writeDatabaseSchema),
    ignore.case = T,
    value = TRUE) %>% 
        rubix::map_names_set(function(x) pg13::query(conn = conn,
                                          pg13::renderRowCount(schema = writeDatabaseSchema,
                                                   tableName = x))) %>% 
        dplyr::bind_rows(.id = "Table") %>% 
        dplyr::rename(RowCount = count)

Regimen Staging Table

The final Regimen Table is derived from the Regimen Staging Table is also to source of Regimens used to process special cohort tables. The Regimen Staging Table also serves as the data source when running OncoRegimenFinder against special cohorts.

At a Glance

regimenStagingTableData <-
  pg13::query(conn = conn,
              sql_statement = pg13::buildQuery(schema = writeDatabaseSchema,
                                               tableName = regimenStagingTable,
                                               n = 20,
                                               n_type = "random"))

regimenStagingTableData

Row Count History

grep(pattern = regimenStagingTable,
    pg13::lsTables(conn = conn,
                   schema = writeDatabaseSchema),
    ignore.case = T,
    value = TRUE) %>% 
        rubix::map_names_set(function(x) pg13::query(conn = conn,
                                          pg13::renderRowCount(schema = writeDatabaseSchema,
                                                   tableName = x))) %>% 
        dplyr::bind_rows(.id = "Table") %>% 
        dplyr::rename(RowCount = count)

Process Regimen Table

The Regimen Table is grouped by Person Id, Drug Exposure Id, Ingredient and Ingredient Start Date and joined onto itself based on overlapping window of time by +/= the Date Lag Input parameter and iterated on based on the Regimen Repeats given.

OncoRegimenFinder::processRegimenTable(conn = conn,
                    writeDatabaseSchema = writeDatabaseSchema,
                    regimenTable = regimenTable,
                    date_lag_input = date_lag_input,
                    regimen_repeats = regimen_repeats)

Regimen Table

At a Glance

regimenTableData <-
  pg13::query(conn = conn,
              sql_statement = pg13::buildQuery(schema = writeDatabaseSchema,
                                               tableName = regimenTable,
                                               n = 20,
                                               n_type = "random"))

regimenTableData

Row Count History

grep(pattern = paste(regimenStagingTable, regimenIngredientTable, sep = "|"),
    pg13::lsTables(conn = conn,
                   schema = writeDatabaseSchema),
    ignore.case = T,
    value = TRUE,
    invert = TRUE) %>% 
  purrr::map(function(x) grep(regimenTable, x, ignore.case = T, value = TRUE)) %>% 
  purrr::keep(~length(.)==1) %>% 
  unlist() %>% 
  rubix::map_names_set(function(x) pg13::query(conn = conn,
                                          pg13::renderRowCount(schema = writeDatabaseSchema,
                                                   tableName = x))) %>% 
        dplyr::bind_rows(.id = "Table") %>% 
        dplyr::rename(RowCount = count)

Write Regimen Ingredient Table

createRegimenIngrTable(conn = conn,
                       writeDatabaseSchema = "patelm9",
                       cohortTable = "oncoregimenfinder_cohort",
                       regimenTable = "oncoregimenfinder_regimen",
                       regimenIngredientTable = "oncoregimenfinder_regimen_ingredients",
                       vocabularyTable = "oncoregimenfinder_vocabulary")

Regimen Ingredient Table

At a Glance

regimenIngrTableData <-
  pg13::query(conn = conn,
              sql_statement = pg13::buildQuery(schema = writeDatabaseSchema,
                                               tableName = regimenIngredientTable,
                                               n = 20,
                                               n_type = "random"))

regimenIngrTableData

Row Count History

grep(pattern = regimenIngredientTable,
    pg13::lsTables(conn = conn,
                   schema = writeDatabaseSchema),
    ignore.case = T,
    value = TRUE) %>% 
        rubix::map_names_set(function(x) pg13::query(conn = conn,
                                          pg13::renderRowCount(schema = writeDatabaseSchema,
                                                   tableName = x))) %>% 
        dplyr::bind_rows(.id = "Table") %>% 
        dplyr::rename(RowCount = count)

fantasia::dcOMOP(conn = conn,
                 remove = TRUE)

meerapatelmd/OncoRegimenFinder documentation built on Jan. 1, 2021, 9:25 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

meerapatelmd/OncoRegimenFinder Derive Oncology Treatment Regimens from the OMOP CDM

In meerapatelmd/OncoRegimenFinder: Derive Oncology Treatment Regimens from the OMOP CDM

Overview

General Procedure Notes

Setup

Setup Requirements

Outputs

Maintenance

Procedure

Load Packages

Set Parameters

Vocabulary Table

Write

Row Count History

At a Glance

Ingredient Exposure Table

Write Ingredient Exposure Table

Row Count History

At a Glance

Run Algorithm

Requirements

Outputs

Intermediate

Final

Procedure

Load Packages

Set Parameters

Steps

Write Cohort and Regimen Staging Tables

Cohort Table

At a Glance

Row Count History

Regimen Staging Table

At a Glance

Row Count History

Process Regimen Table

Regimen Table

At a Glance

Row Count History

Write Regimen Ingredient Table

Regimen Ingredient Table

At a Glance

Row Count History

R Package Documentation

Browse R Packages

We want your feedback!

meerapatelmd/OncoRegimenFinder
Derive Oncology Treatment Regimens from the OMOP CDM