knitr::opts_chunk$set(echo = TRUE)
options(kableExtra.latex.load_packages = FALSE)
library(kableExtra)
# knitr::knit_hooks$set(document = function(x) {sub('\\usepackage[]{color}', '\\usepackage[table]{xcolor}', x, fixed = TRUE)})
library(dplyr)
options(knitr.kable.NA = "")

# pdf2png <- function(path) {
#   # only do the conversion for non-LaTeX output
#   if (knitr::is_latex_output()) {
#     return(path)
#   }
#   path2 <- xfun::with_ext(path, "png")
#   img <- magick::image_read_pdf(path)
#   magick::image_write(img, path2, format = "png")
#   path2
# }

latex_table_font_size <- 8
abbreviations <- readr::read_delim(col_names = FALSE, delim = ";", trim_ws = TRUE, file = "
  CDM; Common data model
  DPP4; Dipeptidyl peptidase-4
  GLP1; Glucagon-like peptide-1
  IRB; Institutional review board
  LEGEND; Large-scale Evidence Generation and Evaluation across a Network of Databases
  MACE; Major adverse cardiovascular event
  MDRR; Minimum detectable risk ratio
  OHDSI; Observational Health Data Science and Informatics
  OMOP; Observational Medical Outcomes Partnership
  PS; Propensity score
  RCT; Randomized controlled trial
  SGLT2; Sodium-glucose co-transporter-2
  T2DM; Type 2 diabetes mellitus
")

tab <- kable(abbreviations, col.names = NULL, linesep = "", booktabs = TRUE)

if (knitr::is_latex_output()) {
  tab %>% kable_styling(latex_options = "striped", font_size = latex_table_font_size)
} else {
  tab %>% kable_styling(bootstrap_options = "striped")
}
LEGEND-T2DM consists of several R
packages available on https://github.com/ohdsi-studies:
LegendT2dm
LegendT2dmCohortExplorer
and LegendT2dmEvidenceExplorer
that themselves depend on a large number of packages from HADES (add link).
To gracefully handle these dependencies, LEGEND-T2DM uses renv
through a version-controlled renv.lock
file to specify each package.
To install LegendT2dm
and all of its dependencies, we first "pull" the latest source from github
:
git clone https://github.com/ohdsi-studies/LegendT2dm
then from within an R
or RStudio
session, we activate renv
:
install.packages("renv") # Only necessary once
renv::activate()
renv::restore()
after which one can now build and install the final LegendT2dm
package.
For the Class-vs-Class Study, we will construct
$$
\begin{aligned}
& (4 \text{ drug-class targets})\ \times \\
& (2 \text{ On-treatment censoring types})\ \times \\
& (2 \text{ Prior metformin-use choices}) \\
& = 16
\end{aligned}
$$
exposure cohort definitions. Likewise, for the Drug-vs-Drug Study, we will construct
$$
\begin{aligned}
& (22 \text{ drug-ingredient targets})\ \times \\
& (2 \text{ On-treatment censoring types})\ \times \\
& (2 \text{ Prior metformin-use choices}) \\
& = 88
\end{aligned}
$$
exposure cohort definitions. For the Heterogeneity Study, we will stratify each of these cohort definitions into
$$
\begin{aligned}
& 3 \text{ (age: 18-44 / 45-64 / }\ge\text{ 65) } + \\
& 2 \text{ (sex: female / male) } + \\
& 1 \text{ (race: Black) } + \\
& 2 \text{ (cardiovascular risk: lower / higher) } + \\
& 2 \text{ (renal impairment: with / without) } \\
& = 10
\end{aligned}
$$
additional definitions, resulting in $(16 + 88) \times 10 = 1040$ further cohort definitions.
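The counts above can be checked with a few lines of arithmetic. This is only a sketch of the bookkeeping; the variable names are illustrative and do not come from the LegendT2dm package.

```r
# Counts taken from the text above; variable names are illustrative only
classTargets     <- 4   # drug-class targets
drugTargets      <- 22  # drug-ingredient targets
censoringTypes   <- 2   # on-treatment censoring types
metforminChoices <- 2   # prior metformin-use choices

classCohorts <- classTargets * censoringTypes * metforminChoices
drugCohorts  <- drugTargets * censoringTypes * metforminChoices

# Number of strata per subgroup variable in the Heterogeneity Study
strata <- c(age = 3, sex = 2, race = 1, cardiovascularRisk = 2, renalImpairment = 2)

heterogeneityCohorts <- (classCohorts + drugCohorts) * sum(strata)

print(c(class = classCohorts, drug = drugCohorts, heterogeneity = heterogeneityCohorts))
```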
This section documents the internal coding used across the three studies to label cohort definitions and their SQL representation. Each cohort carries a 9-digit identifier
$$
\text{ID}\ [DD][T][M][A][S][R][C][K]
$$
where Table \@ref(tab:drug-codes) reports the target 2-digit drug codes $DD$ and Table \@ref(tab:category-codes) lists the codes for time-at-risk $T$, metformin $M$, age $A$, sex $S$, race $R$, cardiovascular $C$ and renal (kidney) $K$ categories.
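To make the identifier layout concrete, here is a minimal sketch that splits a 9-digit cohort ID into its components. The helper `decodeCohortId` and its field names are hypothetical and are not part of LegendT2dm; only the digit positions are taken from the description above.

```r
# Hypothetical helper (not a LegendT2dm function): split a 9-digit cohort ID
# into the [DD][T][M][A][S][R][C][K] components described in the text.
decodeCohortId <- function(cohortId) {
  id <- sprintf("%09d", as.integer(cohortId))
  stopifnot(all(nchar(id) == 9))
  digits <- function(start, stop = start) as.integer(substr(id, start, stop))
  list(
    drug           = digits(1, 2), # DD: 2-digit drug code
    timeAtRisk     = digits(3),    # T
    metformin      = digits(4),    # M
    age            = digits(5),    # A
    sex            = digits(6),    # S
    race           = digits(7),    # R
    cardiovascular = digits(8),    # C
    renal          = digits(9)     # K
  )
}

decoded <- decodeCohortId(301100000)
str(decoded)
```

A code of 0 in the age, sex, race, cardiovascular and renal positions corresponds to "any" in the category table, so an ID ending in five zeros denotes an unstratified cohort.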
addTab <- function(name, cohortId) {
  if (cohortId %% 10 != 0) {
    paste0("\\ \\ \\ \\ ", name)
  } else {
    name
  }
}

exposures <- read.csv(system.file("settings/ExposuresOfInterest.csv",
                                  package = "LegendT2dm"))

drugCodes <- exposures %>%
  arrange(cohortId) %>%
  rowwise() %>%
  mutate(name = addTab(name, cohortId)) %>%
  select(name, cohortId) %>%
  rename(Drug = name, "2-digit Code" = cohortId)

tab <- kable(drugCodes, booktabs = TRUE, linesep = "",
             caption = "2-Digit Drug Codes") %>%
  column_spec(1, width = "10em") %>%
  column_spec(2, width = "30em")

if (knitr::is_latex_output()) {
  tab %>% kable_styling(latex_options = "striped", font_size = latex_table_font_size)
} else {
  tab %>% kable_styling(bootstrap_options = "striped")
}
categoryCodes <- readr::read_delim(col_names = TRUE, delim = ";", trim_ws = TRUE, file = "
  Category; Value; 1-digit Code
  Time-at-risk;
  ; ITT/OT1; 1
  ; OT2; 2
  Metformin;
  ; prior; 1
  ; none; 2
  Age ;
  ; any; 0
  ; younger; 1
  ; middle; 2
  ; older; 3
  Sex ;
  ; any; 0
  ; female; 1
  ; male; 2
  Race ;
  ; any; 0
  ; Black; 1
  Cardiovascular risk ;
  ; any; 0
  ; low; 1
  ; higher; 2
  Renal disease ;
  ; any; 0
  ; without; 1
  ; with; 2
")

footnote <- c("ITT: intent-to-treat; OT1: on-treatment-1; OT2: on-treatment-2")

tab <- kable(categoryCodes, booktabs = TRUE, linesep = "",
             caption = "Other Cohort Category Codes") %>%
  footnote(general = footnote, general_title = "")

if (knitr::is_latex_output()) {
  tab %>% kable_styling(latex_options = "striped", font_size = latex_table_font_size)
} else {
  tab %>% kable_styling(bootstrap_options = "striped")
}
Here are useful regex
masks for the different strata:
mask <- readr::read_csv(system.file("settings", "masks.csv", package = "LegendT2dm"))

tab <- kable(mask, booktabs = TRUE, linesep = "",
             caption = "Regex masks for strata") # %>%
  # footnote(general = footnote, general_title = "")

if (knitr::is_latex_output()) {
  tab %>% kable_styling(latex_options = "striped", font_size = latex_table_font_size)
} else {
  tab %>% kable_styling(bootstrap_options = "striped")
}
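As a rough illustration of how such masks can be applied, the sketch below filters a vector of cohort IDs with `grepl`. The patterns and example IDs here are assumptions written for this example from the 9-digit layout; they are not copied from `masks.csv`, whose actual contents may differ.

```r
# Example cohort IDs following the 9-digit layout [DD][T][M][A][S][R][C][K];
# these IDs are made up for illustration.
cohortIds <- c(301100000, 301200000, 301110000, 301100200, 102100000)
ids <- sprintf("%09d", as.integer(cohortIds))

# Illustrative patterns only (assumed, not taken from settings/masks.csv)
mainMask <- "^[0-9]{4}00000$"      # no stratification: A, S, R, C, K all 0
ageMask  <- "^[0-9]{4}[1-3]0000$"  # stratified by age only
sexMask  <- "^[0-9]{5}[1-2]000$"   # stratified by sex only

mainIds <- ids[grepl(mainMask, ids)]
ageIds  <- ids[grepl(ageMask, ids)]
sexIds  <- ids[grepl(sexMask, ids)]

print(list(main = mainIds, age = ageIds, sex = sexIds))
```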
For each comparison within each study, we will execute the 9 CohortMethod
analysis plans in Table \@ref(tab:plan-matrix).
design <- readr::read_delim(col_names = TRUE, delim = ";", trim_ws = TRUE, file = "
  ID; PS-covariates; PS-adjustment; Outcome model; Risk-window end
  1; ; ; Cox; target exposure end
  2; FE defaults; variable match; Cox (conditional); target exposure end
  3; FE defaults; 5 strata; Cox (conditional); target exposure end
  4; ; ; Cox; observational period end
  5; FE defaults; variable match; Cox (conditional); observational period end
  6; FE defaults; 5 strata; Cox (conditional); observational period end
  7; ; ; Cox; target exposure end or escalation
  8; FE defaults; variable match; Cox (conditional); target exposure end or escalation
  9; FE defaults; 5 strata; Cox (conditional); target exposure end or escalation
")

footnote <- c("FE: FeatureExtraction package (365 days long-term, 180 days medium-term and 30 days short-term look-back)",
              "PS: Propensity score")

tab <- kable(design, linesep = "", booktabs = TRUE,
             caption = "CohortMethod analysis descriptions") %>%
  footnote(general = footnote, general_title = "")

if (knitr::is_latex_output()) {
  tab %>% kable_styling(latex_options = "striped", font_size = latex_table_font_size)
} else {
  tab %>% kable_styling(bootstrap_options = "striped")
}
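The analysis table is a full cross of three propensity-score adjustments with three risk-window ends. The sketch below rebuilds that grid with `expand.grid` to show how the analysis IDs 1 through 9 line up; the data frame and its column names are illustrative and are not how the package itself creates its analysis settings.

```r
# Rebuild the plan matrix as a cross-product (illustrative only).
# expand.grid varies the first argument fastest, which matches the ID order
# in the table: IDs 1-3 share one risk-window end, IDs 4-6 the next, and so on.
plans <- expand.grid(
  psAdjustment  = c("none", "variable match", "5 strata"),
  riskWindowEnd = c("target exposure end",
                    "observational period end",
                    "target exposure end or escalation"),
  stringsAsFactors = FALSE
)

plans$analysisId   <- seq_len(nrow(plans))
plans$psCovariates <- ifelse(plans$psAdjustment == "none", NA, "FE defaults")
plans$outcomeModel <- ifelse(plans$psAdjustment == "none", "Cox", "Cox (conditional)")

print(plans[, c("analysisId", "psCovariates", "psAdjustment", "outcomeModel", "riskWindowEnd")])
```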
The LEGEND-T2DM studies depend on several configuration files.
We directly provide some of these files, while others we generate through executable R
scripts before distributing study packages.
R/CreateAnalysisDetails.R
extra/GenerateExposureCohortDefinitions.R
baseCohort.json
: Base cohort object that we programmatically manipulate to generate all class
and drug
exposure cohorts
inst/settings/ExposuresOfInterest.csv
: List of RxNorm
concepts for individual drug
ingredient and Drug class
specifications for all exposure cohorts. File columns:
type
: Drug
for individual ingredients or Drug class
for ingredient classes
name
: Unsure if currently used; longer name
cohortId
: Unique concept ID used to make cohort ID
conceptId
: RxNorm
concept ID if individual ingredient, otherwise unique Drug class
ID in c(1, 2, 3, 4)
class
: Reference to Drug class
conceptId
if individual ingredient, otherwise NA
shortName
: Unsure if currently used; short name
order
: Unsure if currently used; order to display
includedConceptIds
: RxNorm concept IDs if Drug class
, otherwise NA
inst/settings/OutcomesOfInterest.csv
: List of outcome cohorts to import into study package and instantiate during execution. File columns:
cohortId
: Unique cohort ID inside study package
atlasId
: Unique cohort ID on JnJ ATLAS instance (currently) for importing into package
atlasName
: Short name for Cohort Diagnostics
name
: File prefix name for .json
and .sql
specification
description
: Human-readable, short description of cohort
cite
: List of Rmd
citation keys
isNew
: Indicate a new cohort for LEGEND-T2DM
inst/settings/Indications.csv
: List of study-specific settings. File columns:
indicationId
: Unique study ID key, e.g., class
indicationName
: Study name, e.g., Class-vs-class
filterConceptIds
: Concept IDs to remove from propensity score models
positiveControlIdOffset
: Offset with which to start numbering positive control outcomes
class
studies
The following files are generated by extra/GenerateExposureCohortDefinitions.R
:
inst/settings/classCohortsToCreate.csv
: List of exposure cohorts to create, including both main and heterogeneity
study-stratified cohorts. File columns:
atlasId
: Unique cohort ID
atlasName
: Short name that CohortDiagnostics
will display
cohortId
: Must == atlasId
(unsure if used)
name
: File prefix name for .json
and .sql
specification
inst/settings/classTcosOfInterest.csv
: List of target-comparator (TC) pairs to compare in the class
study. The file name includes Tco
and some columns are retained for historical reasons; all outcomes are executed by default. File columns:
targetId
: Target cohort ID
comparatorId
: Comparator cohort ID
outcomeIds
: Currently unused; defaults to -1
excludedCovariateConceptIds
: Currently unused; defaults to NA
targetName
: Unsure if currently used; short name for target cohort
comparatorName
: Unsure if currently used; short name for comparator cohort
The following files specify the diagnostics and results data models on the public PostgreSQL server:
inst/settings/PsAssessmentModelSpecs.csv
: Propensity score assessment additions to the legendt2dm_*_diagnostics
schema
inst/settings/ResultsModelSpecs.csv
: All study results for the legendt2dm_*_results
schema
These files serve as arguments to createDataModelSqlFile
to generate
inst/sql/postgresql/CreatePsAssessmentTables.sql
inst/sql/postgresql/CreateResultsTables.sql
and to uploadResultsToDatabase
to upload diagnostics and results to the public PostgreSQL server.
CohortMethod
function arguments
This document describes the data models for storing the output of the CohortDiagnostics
and CohortMethod
study artifacts for the LEGEND-T2DM study.
Table \@ref(tab:schema) provides the schema
names.
schema <- readr::read_delim(col_names = FALSE, delim = ";", trim_ws = TRUE, file = "
  legendt2dm_class_diagnostics ; legendt2dm_class_results
  legendt2dm_drug_diagnostics ; legendt2dm_drug_results
  legendt2dm_outcome_diagnostics
")

tab <- kable(schema, col.names = c("Cohort diagnostics", "Study results"),
             linesep = "", booktabs = TRUE,
             caption = "LEGEND-T2DM public study schema")

if (knitr::is_latex_output()) {
  tab %>% kable_styling(latex_options = "striped", font_size = latex_table_font_size)
} else {
  tab %>% kable_styling(bootstrap_options = "striped")
}
Some fields contain patient counts or fractions that are easily converted to patient counts. To prevent identifiability, these fields are subject to a minimum value. When the value falls below this minimum, it is replaced with the negative value of the minimum. For example, if the minimum subject count is 5, and the actual count is 2, the value stored in the data model will be -5, which could be represented as '\<5' to the user. Note that the value 0 is permissible, as it identifies no persons.
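The rule can be sketched in a few lines of R. The two helpers below, `censorSmallCounts` and `formatCount`, are hypothetical names written for this example; they are not the functions the study packages use, only an illustration of the convention described above.

```r
# Hypothetical helpers illustrating the minimum-value convention.
# Counts strictly between 0 and the minimum are stored as the negative minimum;
# 0 is kept because it identifies no persons.
censorSmallCounts <- function(counts, minCellCount = 5) {
  toCensor <- !is.na(counts) & counts > 0 & counts < minCellCount
  counts[toCensor] <- -minCellCount
  counts
}

# Render stored values for display: negative values become "<minimum"
formatCount <- function(value) {
  ifelse(!is.na(value) & value < 0,
         paste0("<", abs(value)),
         as.character(value))
}

stored <- censorSmallCounts(c(0, 2, 5, 12), minCellCount = 5)
shown  <- formatCount(stored)
print(data.frame(stored = stored, shown = shown))
```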
specifications <- readr::read_csv(
  system.file("/settings/resultsDataModelSpecification.csv",
              package = "CohortDiagnostics")) %>%
  filter(optional == "No")
tables <- split(specifications, specifications$tableName)

for (table in tables) {
  header <- sprintf("### Table %s", table$tableName[1])

  table <- table %>%
    select(Field = .data$fieldName, Type = .data$type, Key = .data$primaryKey
           # , Description = .data$description
    ) %>%
    kable(linesep = "", booktabs = TRUE, longtable = TRUE)

  if (knitr::is_latex_output()) {
    writeLines("")
    writeLines(header)
    writeLines(table %>%
      kable_styling(latex_options = "striped", font_size = latex_table_font_size) %>%
      column_spec(1, width = "10em") %>%
      column_spec(2, width = "5em") %>%
      column_spec(3, width = "3em") %>%
      column_spec(4, width = "16em"))
  } else if (knitr::is_html_output()) {
    writeLines("")
    writeLines(header)
    writeLines(table %>% kable_styling(bootstrap_options = "striped"))
  }
}
specifications <- readr::read_csv(
  system.file("settings/PsAssessmentModelSpecs.csv", package = "LegendT2dm"))
tables <- split(specifications, specifications$tableName)

for (table in tables) {
  header <- sprintf("### Table %s", table$tableName[1])

  table <- table %>%
    select(Field = .data$fieldName, Type = .data$type, Key = .data$primaryKey,
           Description = .data$description) %>%
    kable(linesep = "", booktabs = TRUE, longtable = TRUE)

  if (knitr::is_latex_output()) {
    writeLines("")
    writeLines(header)
    writeLines(table %>%
      kable_styling(latex_options = "striped", font_size = latex_table_font_size) %>%
      column_spec(1, width = "10em") %>%
      column_spec(2, width = "5em") %>%
      column_spec(3, width = "3em") %>%
      column_spec(4, width = "16em"))
  } else if (knitr::is_html_output()) {
    writeLines("")
    writeLines(header)
    writeLines(table %>% kable_styling(bootstrap_options = "striped"))
  }
}
specifications <- readr::read_csv(
  system.file("settings/ResultsModelSpecs.csv", package = "LegendT2dm"))
tables <- split(specifications, specifications$tableName)

for (table in tables) {
  header <- sprintf("### Table %s", table$tableName[1])

  table <- table %>%
    select(Field = .data$fieldName, Type = .data$type, Key = .data$primaryKey,
           Description = .data$description) %>%
    kable(linesep = "", booktabs = TRUE, longtable = TRUE)

  if (knitr::is_latex_output()) {
    writeLines("")
    writeLines(header)
    writeLines(table %>%
      kable_styling(latex_options = "striped", font_size = latex_table_font_size) %>%
      column_spec(1, width = "10em") %>%
      column_spec(2, width = "5em") %>%
      column_spec(3, width = "3em") %>%
      column_spec(4, width = "16em"))
  } else if (knitr::is_html_output()) {
    writeLines("")
    writeLines(header)
    writeLines(table %>% kable_styling(bootstrap_options = "striped"))
  }
}
All LegendT2dm
tables are located on OHDSI's public PostgreSQL
server, which is the backend for the apps at https://data.ohdsi.org.
Study PIs can provide read-only credentials that can be stored encrypted on individual computers.
For this latter task, we recommend using the cross-platform keyring
R
package to securely store: legendt2dmServer
, legendt2dmDatabase
, legendt2dmUser
, legendt2dmPassword
and legendT2dmDriverPath
and specifying connection details via:
Sys.setenv(DATABASECONNECTOR_JAR_FOLDER = keyring::key_get("legendT2dmDriverPath"))

legendT2dmConnectionDetails <- DatabaseConnector::createConnectionDetails(
  dbms = "postgresql",
  server = paste(keyring::key_get("legendt2dmServer"),
                 keyring::key_get("legendt2dmDatabase"),
                 sep = "/"),
  user = keyring::key_get("legendt2dmUser"),
  password = keyring::key_get("legendt2dmPassword"))
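Before the code above can run, the five keys have to be stored once on each computer. One possible way to do this is sketched below; `storeLegendT2dmCredentials` is a hypothetical helper written for this guide, not a function in any of the packages, and it assumes `keyring::key_set_with_value()` is an acceptable way to write the values. The `setter` argument exists only so the helper can be exercised without touching a real keyring.

```r
# Hypothetical helper: store the five keys that the connection code reads.
# `values` is a named list; `setter` defaults to keyring::key_set_with_value
# and is only evaluated when the function is actually called.
storeLegendT2dmCredentials <- function(values,
                                       setter = keyring::key_set_with_value) {
  required <- c("legendt2dmServer", "legendt2dmDatabase", "legendt2dmUser",
                "legendt2dmPassword", "legendT2dmDriverPath")
  absent <- setdiff(required, names(values))
  if (length(absent) > 0) {
    stop("Missing values for: ", paste(absent, collapse = ", "))
  }
  for (key in required) {
    setter(service = key, password = values[[key]])
  }
  invisible(required)
}

# Example call (placeholders only; use the values supplied by the study PIs):
# storeLegendT2dmCredentials(list(
#   legendt2dmServer     = "<server>",
#   legendt2dmDatabase   = "<database>",
#   legendt2dmUser       = "<user>",
#   legendt2dmPassword   = "<password>",
#   legendT2dmDriverPath = "<folder containing the JDBC driver>"
# ))
```

For interactive use, `keyring::key_set("legendt2dmPassword")` prompts for the value instead, which avoids typing a password into a script or the R history.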
To explore cohorts and their diagnostics through an R Shiny application, we use the LegendT2dmCohortExplorer
package.
The following code will load all class
cohorts:
LegendT2dmCohortExplorer::launchCohortExplorer(cohorts = "class", connectionDetails = legendT2dmConnectionDetails)
Use cohorts = "drug"
and cohorts = "outcome"
to explore the Drug-vs-Drug Study exposure and outcome cohorts, respectively.
This Shiny application is based on CohortDiagnostics v2.1
and it is also possible to explore cohorts using local files by specifying the dataFolder
argument.
A useful Cohort name
filter is main(
(see Table \@ref(tab:category-codes)).
To explore the comparative effectiveness and safety evidence through an R Shiny application, we use the LegendT2dmEvidenceExplorer
package.
The following code will load all class
cohort comparison evidence:
LegendT2dmEvidenceExplorer::launchEvidenceExplorer(cohorts = "class", connectionDetails = legendT2dmConnectionDetails)
Use cohorts = "drug"
to explore the drug
cohort comparison evidence.
It is also possible to explore evidence using local files by specifying the dataFolder
argument.
Orphan concepts are (source) concepts that likely should be included in a concept set, but aren't, often because of vocabulary issues. This is a rough outline of the process:
The LegendT2dm
package provides functions to render cohort definitions into Rmarkdown
for output to PDF or HTML.
Here is an example of rendering the complete definition for new users of SGLT2 inhibitors:
cohortJson <- SqlRender::readSql(system.file("cohorts", "class", "ID301100000.json",
                                             package = "LegendT2dm"))

LegendT2dm::printCohortDefinitionFromNameAndJson(
  name = "Class-vs-Class Exposure (SGLT2 New-User) Cohort / OT1",
  json = cohortJson)
If possible, LEGEND-T2DM publications should be drafted in Rmarkdown
with *.Rmd
source files version-controlled on github
.
Initial drafting can begin in a private repository, but final versions should end up in ohdsi-studies/LegendT2dm/Documents
.
Currently, we have the following documents:
Protocol/Protocol.Rmd
: official protocol registered (soon) with EU PASS
Protocol/MedRvix.Rmd
: reformatted protocol for medRxiv
, available at: (add url)
Protocol/BmjOpen.Rmd
: reformatted protocol for BMJOpen (to be submitted)
Nothing yet.
We have a cultural war here between PowerPoint
, latex
and Rmarkdown
.