knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
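TestGenerator helps you test pharmacoepidemiological study code against a small, explicit OMOP CDM test population. The typical workflow is to convert an Excel or CSV test population to JSON with readPatients(), load that JSON into a test CDM with patientsCDM(), and then run your study code against the CDM and assert the results you expect.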
This vignette uses the ICU sample population included with the package.
An Excel input file should contain one sheet per OMOP CDM table. For example,
the sheet names can include person, observation_period, visit_occurrence,
condition_occurrence, drug_exposure, and measurement.
library(TestGenerator)

file_path <- system.file(
  "extdata", "icu_sample_population.xlsx",
  package = "TestGenerator"
)

output_path <- file.path(tempdir(), "testgenerator-example")
dir.create(output_path, showWarnings = FALSE, recursive = TRUE)

readPatients(
  filePath = file_path,
  testName = "icu_sample",
  outputPath = output_path,
  cdmVersion = "5.4"
)
This writes icu_sample.json to output_path. Keeping these JSON files in
tests/testthat/testCases makes them easy to reuse from package tests. When
outputPath = NULL, TestGenerator writes to that default test case folder.
Use patientsCDM() to create a CDM reference containing the small patient
population and a complete vocabulary. By default, the CDM is created in DuckDB.
cdm <- patientsCDM(
  pathJson = output_path,
  testName = "icu_sample",
  cdmVersion = "5.4"
)

cdm[["person"]]
If pathJson = NULL, TestGenerator looks for JSON files in
tests/testthat/testCases.
cdm <- patientsCDM(
  pathJson = NULL,
  testName = "icu_sample",
  cdmVersion = "5.4"
)
Once the test CDM is available, run the same study code you use on a real CDM.
The package includes example cohort definitions under inst/extdata/test_cohorts.
library(CDMConnector)
library(dplyr)
library(testthat)

test_cohorts <- system.file(
  "extdata", "test_cohorts",
  package = "TestGenerator"
)

cohort_set <- readCohortSet(test_cohorts)

cdm <- generateCohortSet(
  cdm = cdm,
  cohortSet = cohort_set,
  name = "test_cohorts"
)

cohort_attrition <- attrition(cdm[["test_cohorts"]])

excluded_records <- cohort_attrition |>
  pull(excluded_records) |>
  sum()

expect_equal(excluded_records, 0)
In a package test, place this code in tests/testthat/test-*.R and assert the
specific counts, dates, durations, or intersections that your study should
produce for the micro population.
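As a rough sketch of what such assertions can look like, the example below checks record counts, durations, and a start date. The cohort data frame is a hypothetical stand-in built by hand so that the example runs on its own; in a real test it would come from the generated cohort table (for example after collecting cdm[["test_cohorts"]]), and the expected values would be the ones you designed into your test population.

```r
library(testthat)

# Hypothetical stand-in for a collected cohort table. The values are invented
# for illustration and are not taken from the ICU sample population.
cohort <- data.frame(
  cohort_definition_id = c(1L, 1L, 2L),
  subject_id = c(1L, 2L, 1L),
  cohort_start_date = as.Date(c("2020-01-01", "2020-03-15", "2020-01-10")),
  cohort_end_date = as.Date(c("2020-01-31", "2020-03-20", "2020-01-12"))
)

# Duration in days, counting both the start date and the end date.
cohort$duration_days <- as.integer(
  cohort$cohort_end_date - cohort$cohort_start_date
) + 1L

test_that("micro population yields the expected cohort entries", {
  # Counts per cohort definition
  expect_equal(sum(cohort$cohort_definition_id == 1L), 2L)
  expect_equal(sum(cohort$cohort_definition_id == 2L), 1L)

  # Durations of each cohort entry
  expect_equal(cohort$duration_days, c(31L, 6L, 3L))

  # Earliest cohort entry date
  expect_equal(min(cohort$cohort_start_date), as.Date("2020-01-01"))
})
```

Because the test population is small and explicit, every expected value can be worked out by hand from the input rows, which keeps a failing assertion easy to trace back to a specific patient.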
If you want to design a new test population from scratch, use generateTestTables() to create an empty Excel workbook with the required CDM table columns.
generateTestTables(
  tableNames = c(
    "person",
    "observation_period",
    "visit_occurrence",
    "condition_occurrence",
    "drug_exposure",
    "measurement"
  ),
  cdmVersion = "5.4",
  outputFolder = output_path,
  filename = "my_test_population"
)
Fill in the workbook rows for the small set of patients needed by your test,
then pass the completed workbook to readPatients().
For CSV inputs, place one file per CDM table in a folder. File names should
match the table names, for example person.csv and observation_period.csv.
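The folder layout can be sketched with base R alone. The example below writes two one-row tables to a temporary folder and confirms that the file names match the table names. The folder name, the patient, and the concept IDs are invented for illustration, and only a handful of OMOP CDM columns are shown; which columns TestGenerator requires for each table is not covered here.

```r
# Hypothetical folder holding one CSV file per CDM table
csv_dir <- file.path(tempdir(), "my-csv-population")
dir.create(csv_dir, showWarnings = FALSE, recursive = TRUE)

# Illustrative single-patient tables (values are assumptions, not package data)
person <- data.frame(
  person_id = 1L,
  gender_concept_id = 8532L,
  year_of_birth = 1980L,
  race_concept_id = 0L,
  ethnicity_concept_id = 0L
)

observation_period <- data.frame(
  observation_period_id = 1L,
  person_id = 1L,
  observation_period_start_date = as.Date("2020-01-01"),
  observation_period_end_date = as.Date("2020-12-31"),
  period_type_concept_id = 32817L
)

# Write each table to <table name>.csv so file names match table names
tables <- list(person = person, observation_period = observation_period)
for (table_name in names(tables)) {
  write.csv(
    tables[[table_name]],
    file.path(csv_dir, paste0(table_name, ".csv")),
    row.names = FALSE
  )
}

# Table names recovered from the file names in the folder
csv_tables <- tools::file_path_sans_ext(
  list.files(csv_dir, pattern = "\\.csv$")
)
csv_tables
```

A folder built this way has the same shape as the sample folder shipped with the package, so its path is what you would supply as filePath.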
csv_path <- system.file(
  "extdata", "mimic_sample",
  package = "TestGenerator"
)

readPatients.csv(
  filePath = csv_path,
  testName = "mimic_sample",
  outputPath = output_path,
  cdmVersion = "5.4"
)
For source datasets with very large integer identifiers, set
reduceLargeIds = TRUE.
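To see why very large identifiers can be a problem in R, the sketch below shows that they exceed the range of the integer type and lose precision as doubles, and then remaps them to small sequential integers. This only illustrates the general idea; the identifiers are hypothetical, and the remapping shown is an assumption for illustration, not a description of how reduceLargeIds works internally.

```r
# Hypothetical source identifiers, kept as character to avoid precision loss
source_ids <- c("9007199254740993", "9007199254740995", "9007199254740993")

# R integers are 32-bit, so these values do not fit
int_limit <- .Machine$integer.max

# As doubles, neighbouring large values can collapse into the same number
collapsed <- as.numeric("9007199254740993") == as.numeric("9007199254740992")

# Illustrative remapping: each distinct source ID gets a small integer ID,
# assigned in order of first appearance
distinct_ids <- unique(source_ids)
small_ids <- match(source_ids, distinct_ids)

# Lookup table from new IDs back to the original identifiers
id_map <- data.frame(
  new_id = seq_along(distinct_ids),
  source_id = distinct_ids
)
id_map
```

Keeping a lookup table like id_map makes it possible to trace a test patient back to the source record after the identifiers have been shortened.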
For local DuckDB examples, disconnect when the test has finished.
DBI::dbDisconnect(CDMConnector::cdmCon(cdm), shutdown = TRUE)
unlink(output_path, recursive = TRUE)