In OHDSI/CemConnector: OHDSI Common Evidence Model (CEM) Connector

```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
# Load the test helpers and build a temporary SQLite CEM from the test fixtures
source(file.path("..", "tests", "testthat", "helper.R"))
sqlidb <- tempfile(fileext = ".sqlite")
connectionDetails <- DatabaseConnector::createConnectionDetails(dbms = "sqlite", server = sqlidb)
.loadCemTestFixtures(connectionDetails)
cemConnection <- CemConnector::CemDatabaseBackend$new(connectionDetails = connectionDetails,
                                                      vocabularyDatabaseSchema = "main",
                                                      cemDatabaseSchema = "main",
                                                      sourceDatabaseSchema = "main")
```

```{css echo=FALSE}
h1 { margin-top: 60px; }
```

# Introduction
This guide assumes some familiarity with the common evidence model (CEM) and an installation that is already set up and configured.
Please refer to [the CEM GitHub repository](https://github.com/OHDSI/CommonEvidenceModel) for more information.

The following guide highlights how to use the CemConnector package once it has been installed following the instructions in the readme.

# Connecting
CemConnector provides two functionally equivalent ways to connect to a CEM: via a web API and via a database.

## Using CemConnector with a web backend
The simplest way to connect to the CEM is through a hosted web instance, for example the public API available at https://cem.ohdsi.org/ (NOTE: TBD).
To do this we will use a web backend.
First we need to create a connection to the web API:

```r
cemConnection <- CemConnector::createCemConnection(apiUrl = "https://cem.ohdsi.org/")
```

## Using CemConnector with a database backend

Alternatively, a connection can be made with a DatabaseConnector::connectionDetails object. Contact your CEM administrator for details on connecting this way.

```r
connectionDetails <- DatabaseConnector::createConnectionDetails(user = "mydbusername",
                                                                server = "myserver/foo",
                                                                dbms = "redshift",
                                                                password = "mysecret")

cemConnection <- CemConnector::createCemConnection(connectionDetails = connectionDetails,
                                                   vocabularyDatabaseSchema = "vocabulary",
                                                   cemDatabaseSchema = "cem_v2",
                                                   sourceDatabaseSchema = "cem_v2_source")
```

This should connect to the server without error.

## Viewing evidence sources

The evidence sources available in the CEM can be listed as follows:

```r
cemConnection$getCemSourceInfo()
```

This describes each data source: who provides it, what it is, where it comes from and when it was generated.

# Constructing concept set definitions

To generate concept sets we recommend using PHOEBE. This is a simple way of finding standard concepts (SNOMED or RxNorm) that relate to diseases and outcomes of interest for which real world evidence has been observed. For this example we will use the concept for moderate depression as a disease outcome and the ingredient codeine as an exposure. It is important to note that the CEM stores information in the standard vocabulary only. Similarly, drug concepts are mapped at the ingredient level, so you will want to use ingredient concepts rather than drug formulations. For example, instead of a specific drug dose such as 'codeine 30 MG', it is advisable to use only the ingredient 'codeine'. Naturally, designing concept sets is highly specific to a phenotype of interest, and careful consideration should go into the concepts used.

In this example, we would like both to find any existing evidence that relates these two concepts and to select potential negative controls that can be used to calibrate p-values and confidence intervals from effect estimates that we may generate in observational studies.

Concept sets can also be imported from ATLAS. See the ROhdsiWebApi package for more information on how to use this in R.
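
As a rough sketch of what such an import involves: an ATLAS concept set expression is a list of items, each carrying a concept together with isExcluded and includeDescendants flags. Assuming the expression has already been retrieved and parsed into an R list, it can be flattened into the data.frame layout used in this guide. The `atlasExpression` object below is a hand-written stand-in, and `atlasExpressionToConceptSet` is a hypothetical helper that is not part of CemConnector or ROhdsiWebApi.

```r
# Hand-written stand-in for a parsed ATLAS concept set expression
# (in practice this would be retrieved from a WebAPI instance).
atlasExpression <- list(
  items = list(
    list(concept = list(CONCEPT_ID = 1201620, CONCEPT_NAME = "codeine"),
         isExcluded = FALSE, includeDescendants = TRUE),
    list(concept = list(CONCEPT_ID = 21604296, CONCEPT_NAME = "Other opioids"),
         isExcluded = FALSE, includeDescendants = TRUE)
  )
)

# Hypothetical helper: keep only the three fields CemConnector searches on.
atlasExpressionToConceptSet <- function(expression) {
  rows <- lapply(expression$items, function(item) {
    data.frame(
      conceptId = item$concept$CONCEPT_ID,
      includeDescendants = as.integer(isTRUE(item$includeDescendants)),
      isExcluded = as.integer(isTRUE(item$isExcluded))
    )
  })
  do.call(rbind, rows)
}

conceptSet <- atlasExpressionToConceptSet(atlasExpression)
conceptSet
```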

CemConnector requires 3 fields for searching for relationships: conceptId, includeDescendants and isExcluded. Other fields can be present in the data.frame, but will be unused in searches.
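
A minimal sketch of this requirement in base R is shown below. The `checkConceptSet` function is a hypothetical helper written for this illustration, not a CemConnector function. The extra `conceptName` column is carried along for readability and, per the description above, would simply be ignored in searches.

```r
requiredFields <- c("conceptId", "includeDescendants", "isExcluded")

# Hypothetical helper: stop with an informative message if a required field is missing.
checkConceptSet <- function(conceptSet) {
  missingFields <- setdiff(requiredFields, names(conceptSet))
  if (length(missingFields) > 0) {
    stop("Concept set is missing required fields: ",
         paste(missingFields, collapse = ", "))
  }
  invisible(conceptSet)
}

# A concept set with the three required fields plus one extra, unused column.
ingredientConceptSet <- data.frame(
  conceptId = c(1201620),
  conceptName = c("codeine"),
  includeDescendants = c(1),
  isExcluded = c(0)
)

checkConceptSet(ingredientConceptSet)
```

Running a check like this before each search makes a misspelled column name fail early, with a message that names the missing field.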

# Negative control suggestions

Negative controls are one of the most common ways to use the information within the CEM in observational health research. CemConnector can suggest negative controls for use in population-level effect estimation (PLE) studies. Negative controls are exposures or outcomes selected because they are believed to have no direct causal relationship with the exposure or outcome of interest. For example, in-grown toenails are not caused by exposure to ACE inhibitors. However, in-grown toenails are extremely common, and many patients will experience this outcome while exposed to the drug. Consequently, any observed effect is a product of residual bias (e.g. from the study design choices), and this can be used to produce calibrated effect estimates.

The design of a PLE study is outside the scope of this vignette; however, the selection of outcomes or exposures can be achieved at the concept level. Here, CemConnector takes a concept set as a parameter and finds outcomes or exposures that have no mapped relationships within the CEM. These mappings are not perfect, and human curation by experienced clinicians is advised.

To find negative control outcomes for an ingredient concept set, for example the drug codeine, use the following code:

```r
ingredientConceptSet <- data.frame(conceptId = c(1201620), includeDescendants = c(1), isExcluded = c(0))
outcomeConcepts <- cemConnection$getSuggestedControlCondtions(ingredientConceptSet)
outcomeConcepts
```

Alternatively, negative control exposures may be considered (i.e. ingredients thought neither to cause nor to treat a given condition). For example, when searching for depressive disorder and its descendants:

```r
conditionConceptSet <- data.frame(conceptId = c(440383), includeDescendants = c(1), isExcluded = c(0))
ingredientConcepts <- cemConnection$getSuggestedControlIngredients(conditionConceptSet)
ingredientConcepts
```

In both cases the row order indicates the rank of the negative control suggestions. The rank simply reflects how commonly a co-occurrence has been observed in the OHDSI network. For example, in-grown toenails and acetaminophen are both very commonly observed, yet are unlikely to have any medical relationship with most drugs and most patient outcomes, respectively.

# Exploring evidence

It is often desirable to find out what information relates exposures and outcomes. The CEM provides a variety of information, including drug product labels, literature searches, clinical trials data and spontaneous adverse event reporting data.

## Searching for conditions related to exposures

The simplest drug concept set contains a single ingredient, for example the drug codeine.

```r
ingredientConceptSet <- data.frame(conceptId = c(1201620), includeDescendants = c(1), isExcluded = c(0))
outcomeConcepts <- cemConnection$getIngredientEvidenceSummary(ingredientConceptSet)
```

Though not necessary for this ingredient set, the utility can also look for descendants. This may be useful for an entire class of medications, such as the ATC classification Other opioids:

```r
parentIngredientSet <- data.frame(conceptId = c(21604296), includeDescendants = c(1), isExcluded = c(0))
```

We would like to search for any disease concepts related to this query. CemConnector provides a simple evidence summary for all resulting concepts that are in the database.

```r
outcomeConcepts <- cemConnection$getIngredientEvidenceSummary(parentIngredientSet)
outcomeConcepts
```

The resulting data frame lists condition concepts, each with a 1 or 0 indicating whether any evidence exists within the CEM.
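
As an illustration of working with such a 1/0 flag, the sketch below splits a summary into concepts with and without evidence. It uses a mock data frame so that it runs without a connection. The column names `conditionConceptId` and `evidenceExists` and the ids are placeholders assumed for this example; check `names(outcomeConcepts)` for the columns actually returned.

```r
# Mock stand-in for an evidence summary; column names and ids are placeholders.
mockSummary <- data.frame(
  conditionConceptId = c(1, 2, 3, 4),
  evidenceExists = c(1, 0, 1, 0)
)

# Concepts for which at least one evidence source reports a relationship
withEvidence <- mockSummary[mockSummary$evidenceExists == 1, ]

# Concepts with no mapped evidence
withoutEvidence <- mockSummary[mockSummary$evidenceExists == 0, ]

nrow(withEvidence)
nrow(withoutEvidence)
```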

For an item here, we may want to look at the specific evidence involved. To do this we would look at an exposure-outcome pair from the concept set.

To get all the evidence found in the CEM for the ingredient concept set, including the source it is from and the associated counts or statistics, we can look at the entire search universe.

```r
conditionEvidence <- cemConnection$getIngredientEvidence(parentIngredientSet)
conditionEvidence
```

## Searching for drug ingredients related to conditions

Searching for ingredients related to conditions can be achieved in a similar manner. In this example we search for evidence related to dysthymia.

```r
conditionConceptSet <- data.frame(conceptId = c(433440), includeDescendants = c(1), isExcluded = c(0))
# Pass the condition concept set defined above (not the ingredient set)
cemConnection$getConditionEvidenceSummary(conditionConceptSet)

cemConnection$getConditionEvidence(conditionConceptSet)
```

The mapping between conditions and ingredients is often complicated; for example, moderate depressive disorder may not match exactly. Consequently, it may be desirable to explore siblings of the related concept using the concept ancestry. This can be achieved with the siblingLookupLevels parameter, an integer that specifies how many levels up the concept ancestry to search:

```r
minorDepressionConceptSet <- data.frame(conceptId = c(440383), includeDescendants = c(0), isExcluded = c(0))
cemConnection$getConditionEvidence(minorDepressionConceptSet, siblingLookupLevels = 1)
```

However, this approach is likely to result in low specificity when searching for related concepts. It is strongly advised to consider a revised concept set instead, especially when searching for negative controls.

## Searching for evidence pairs

The relationships between specific ingredient and condition concept sets are often highly informative. A goal of CemConnector is to allow exploration of this evidence. For example, we may be interested in what evidence (if any) relates codeine to dysthymia. This can be achieved with the getRelationships method:

```r
conditionConceptSet <- data.frame(conceptId = c(433440), includeDescendants = c(1), isExcluded = c(0))
ingredientConceptSet <- data.frame(conceptId = c(1201620), includeDescendants = c(1), isExcluded = c(0))
cemConnection$getRelationships(conditionConceptSet = conditionConceptSet, ingredientConceptSet = ingredientConceptSet)
```

Here we see a number of relationship definitions that relate the two concept sets, including PubMed literature (via MeSH terms) as well as spontaneous adverse event reports and product labels. Refer to cemConnection$getCemSourceInfo() when interpreting these results, as they depend strongly on the data sources and their date of generation.

# Study packages

We can use the CemConnector package to pull down a selection of negative control suggestions to use within a population-level effect estimation (PLE) study. For an example of a PLE study, see the skeleton package:

https://github.com/OHDSI/SkeletonComparativeEffectStudy

The suggestions are outcomes for the target exposures that have no mapping to literature searches, drug product labels, clinical trials or statistically significant spontaneous adverse event reports.

```r
remotes::install_github("ohdsi/CemConnector")
# URL tbc
cemUrl <- "https://cem.ohdsi.org"
cemConnection <- CemConnector::createCemConnection(apiUrl = cemUrl)
# Celecoxib/Diclofenac cohort concept set
conceptSet <- data.frame(conceptId = c(1118084, 1124300), includeDescendants = TRUE, isExcluded = FALSE)
```

Here we selected a concept set that defines the ingredients in our target and comparator exposures. The concept set search will automatically exclude any outcomes for which evidence exists. Therefore, it is vital to select this concept set with care.

```r
suggestedControls <- cemConnection$getSuggestedControlCondtions(conceptSet, nControls = 100)
```

As evidence mapping is imperfect, it is highly advisable to review the suggested negative controls and refine them before adding them to the study package. The nControls parameter sets how many controls are returned, so that a larger pool can be requested and then filtered; by default it is set to 50. The returned controls are ranked according to how often the condition/drug pairing appears in OHDSI network studies. The selected controls can be appended to the study package, making sure to specify the target and comparator correctly:

```r
CemConnector::addControlsToStudyPackage(controlConcepts = filteredControls,
                                        targetId = targetId,
                                        comparatorId = comparatorId,
                                        fileName = "inst/settings/NegativeControls.csv",
                                        type = "outcome")
```
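
The `filteredControls` object passed above is the curated subset of `suggestedControls`; how it is produced is left to the analyst. One minimal sketch drops the concepts a clinician has rejected while preserving the rank order. It uses a mock data frame so that it runs without a connection. The `conceptId` and `conceptName` columns, the ids and the `rejectedConceptIds` vector are all placeholders assumed for this illustration.

```r
# Mock stand-in for suggestedControls; ids and names are placeholders.
suggestedControlsMock <- data.frame(
  conceptId = c(101, 102, 103, 104, 105),
  conceptName = paste("placeholder condition", 1:5)
)

# Hypothetical list of suggestions rejected during clinical review
rejectedConceptIds <- c(102, 105)

# Keep the remaining rows in their original (ranked) order
keepRows <- !suggestedControlsMock$conceptId %in% rejectedConceptIds
filteredControls <- suggestedControlsMock[keepRows, , drop = FALSE]
rownames(filteredControls) <- NULL
filteredControls
```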

We also need logic to convert the concept IDs into cohorts. This logic is defined in the file inst/sql/NegativeControlOutcomes.sql. The default logic simply creates a cohort for every occurrence of the concept ID or any of its descendants in the condition era table. If other logic is required the SQL file can be modified. Be sure to use template SQL as expected by the SqlRender package.

OHDSI/CemConnector documentation built on Aug. 5, 2023, 2:47 p.m.
