library(CDMConnector)
if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") Sys.setenv("EUNOMIA_DATA_FOLDER" = file.path(tempdir(), "eunomia"))
if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER"))
if (!eunomiaIsAvailable()) downloadEunomiaData()
knitr::opts_chunk$set(
  collapse = TRUE,
  eval = rlang::is_installed("duckdb"),
  comment = "#>"
)
First let's load the packages required for the code in this vignette. If you haven't already installed them, they can be installed using `install.packages()`.
library(CDMConnector)
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
Now let's connect to a duckdb database with the Eunomia data (https://github.com/OHDSI/Eunomia).
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomiaDir())
cdm <- cdmFromCon(con, cdmName = "eunomia", cdmSchema = "main", writeSchema = "main")
cdm
This cdm object is what we'll use going forward. It provides a reference to the OMOP CDM tables. The tables themselves are still in the database, but we now have a reference to each of the ones we might want to use in our analysis. For example, the person table can be referenced as `cdm$person`.
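As a minimal sketch of what such a reference looks like in practice, the block below lists the tables in the cdm object and previews the person table. It reuses the `cdm` created in the connection step and, as an assumption for running it on its own, rebuilds the connection with the same calls if `cdm` does not exist yet.

```r
library(CDMConnector)
library(dplyr, warn.conflicts = FALSE)

# Reuse the `cdm` reference created above; build it only if this block is run alone.
if (!exists("cdm")) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomiaDir())
  cdm <- cdmFromCon(con, cdmName = "eunomia", cdmSchema = "main", writeSchema = "main")
}

# Names of the tables available through the reference
names(cdm)

# A lazy reference to the person table; printing it shows a preview of the rows
cdm$person

# The row count is computed in the database, not in R
cdm$person %>% tally()
```

Nothing is pulled into memory here beyond the small previews that are printed.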
Say we want to make a histogram of year of birth in the person table. We can select that variable, bring it into memory, and then use ggplot to make the histogram.
cdm$person %>%
  select(year_of_birth) %>%
  collect() %>%
  ggplot(aes(x = year_of_birth)) +
  geom_histogram(bins = 30)
If we wanted to make a boxplot of the length of observation periods (in years), we could do the computation on the database side, bring the new variable into memory, and use ggplot to produce the boxplot.
cdm$observation_period %>%
  select(observation_period_start_date, observation_period_end_date) %>%
  # Divide the number of days by 365.25 to express the period in years
  mutate(observation_period = (observation_period_end_date - observation_period_start_date) / 365.25) %>%
  select(observation_period) %>%
  collect() %>%
  ggplot(aes(x = observation_period)) +
  geom_boxplot()
We can use `show_query()` to check the SQL that is run against duckdb.
cdm$person %>% tally() %>% show_query()
cdm$person %>%
  summarise(median(year_of_birth)) %>%
  show_query()
cdm$person %>%
  mutate(gender = case_when(
    gender_concept_id == "8507" ~ "Male",
    gender_concept_id == "8532" ~ "Female",
    TRUE ~ NA_character_)) %>%
  show_query()
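The same pattern can be tried without the Eunomia data. The sketch below is an illustration only: the table `toy_person`, its three rows, and the connection `con_toy` are made up for the example. It builds a lazy query against an in-memory duckdb database, prints the generated SQL, and only then runs the query with `collect()`.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical in-memory database with a small made-up table
con_toy <- DBI::dbConnect(duckdb::duckdb())
toy <- copy_to(
  con_toy,
  data.frame(person_id = 1:3, gender_concept_id = c(8507L, 8532L, 0L)),
  name = "toy_person"
)

# Building the pipeline does not run anything in the database yet
lazy_query <- toy %>%
  mutate(gender = case_when(
    gender_concept_id == 8507L ~ "Male",
    gender_concept_id == 8532L ~ "Female",
    TRUE ~ NA_character_)) %>%
  arrange(person_id)

# Print the SQL that dbplyr would send to duckdb
show_query(lazy_query)

# Execute the query and bring the result into memory
result <- collect(lazy_query)
result

DBI::dbDisconnect(con_toy, shutdown = TRUE)
```

Inspecting the SQL before calling `collect()` is a cheap way to confirm that the work is done in the database and not in R.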
DBI::dbDisconnect(con, shutdown = TRUE)