This work references data from the Medical Information Mart for Intensive Care MIMIC-III database v1.4. MIMIC-III is a large, freely-available database comprising de-identified health-related data from patients who were admitted to the critical care units of the Beth Israel Deaconess Medical Center from 2001-2019. Detailed information can be obtained on the MIMIC-III website.
MIMIC3db provides a tightly bound set of routines to reference a SQLite version of the MIMIC-III v1.4 database on a local network using base access routines from the RSQLite R package.
You can install MIMIC3db from GitHub with:
# install.packages("devtools")
devtools::install_github("hdshea/MIMIC3db")
Assumptions:
db
directory defined in it,db
directory, anddb
directory retaining their default namesNOTE: you can put the database and the SQL scripts in any directory that
you like; this example just uses a directory named db
in the base
project directory as an example.
These assumptions having been met, the following code run in the terminal from the base directory will:
NOTE: the zip file containing the MIMIC-III v1.4 raw data files is 6.2 GB and the built database on my MacBook Pro M1 local hard drive is 85.6 GB.
cd db
sqlite3 MIMIC-III.db < mimic3_create_script.sql
sqlite3 MIMIC-III.db < mimic3_load_small_tables.sql
sqlite3 MIMIC-III.db < mimic3_load_CHARTEVENTS.sql
sqlite3 MIMIC-III.db < mimic3_load_DATETIMEEVENTS.sql
sqlite3 MIMIC-III.db < mimic3_load_INPUTEVENTS_CV.sql
sqlite3 MIMIC-III.db < mimic3_load_INPUTEVENTS_MV.sql
sqlite3 MIMIC-III.db < mimic3_load_LABEVENTS.sql
sqlite3 MIMIC-III.db < mimic3_load_NOTEEVENTS.sql
sqlite3 MIMIC-III.db < mimic3_load_OUTPUTEVENTS.sql
sqlite3 MIMIC-III.db < mimic3_load_PRESCRIPTIONS.sql
sqlite3 MIMIC-III.db < fix_data_fields.sql
sqlite3 MIMIC-III.db < mimic3_add_indexes.sql
cd ..
This code then shows the tables in the new database.
base_dir <- here::here("")
db_file <- fs::path(base_dir, "db/MIMIC-III.db")
if(RSQLite::dbCanConnect(RSQLite::SQLite(), db_file)) {
con <- RSQLite::dbConnect(RSQLite::SQLite(), db_file)
} else {
stop(stringr::str_c("Database file: ", db_file, " not found.", sep=""))
}
RSQLite::dbListTables(con)
#> [1] "ADMISSIONS" "CALLOUT" "CAREGIVERS"
#> [4] "CHARTEVENTS" "CPTEVENTS" "DATETIMEEVENTS"
#> [7] "DIAGNOSES_ICD" "DRGCODES" "D_CPT"
#> [10] "D_ICD_DIAGNOSES" "D_ICD_PROCEDURES" "D_ITEMS"
#> [13] "D_LABITEMS" "ICUSTAYS" "INPUTEVENTS_CV"
#> [16] "INPUTEVENTS_MV" "LABEVENTS" "MICROBIOLOGYEVENTS"
#> [19] "NOTEEVENTS" "OUTPUTEVENTS" "PATIENTS"
#> [22] "PRESCRIPTIONS" "PROCEDUREEVENTS_MV" "PROCEDURES_ICD"
#> [25] "SERVICES" "TRANSFERS"
RSQLite::dbDisconnect(con)
Assuming that you have loaded the appropriate version of the MIMIC-III
database into a database at db/MIMIC-III.db
, then the following code
will set up a connection con
to that database.
library(MIMIC3db)
base_dir <- here::here("")
db_file <- fs::path(base_dir, "db/MIMIC-III.db")
if(RSQLite::dbCanConnect(RSQLite::SQLite(), db_file)) {
con <- RSQLite::dbConnect(RSQLite::SQLite(), db_file)
} else {
stop(stringr::str_c("Database file: ", db_file, " not found.", sep=""))
}
This is a basic example showing the use of the two base functions.
patients <- db_select_data(con, "SELECT * FROM PATIENTS")
patients
#> # A tibble: 0 × 8
#> # … with 8 variables: ROW_ID <int>, SUBJECT_ID <int>, GENDER <chr>, DOB <dbl>,
#> # DOD <dbl>, DOD_HOSP <dbl>, DOD_SSN <dbl>, EXPIRE_FLAG <int>
admissions <- db_get_from_table(con, "ADMISSIONS", where = "WHERE SUBJECT_ID = 10006")
admissions
#> # A tibble: 0 × 18
#> # … with 18 variables: SUBJECT_ID <int>, HADM_ID <int>, ADMITTIME <dttm>,
#> # DISCHTIME <dttm>, DEATHTIME <dttm>, ADMISSION_TYPE <chr>,
#> # ADMISSION_LOCATION <chr>, DISCHARGE_LOCATION <chr>, INSURANCE <chr>,
#> # LANGUAGE <chr>, RELIGION <chr>, MARITAL_STATUS <chr>, ETHNICITY <chr>,
#> # EDREGTIME <dttm>, EDOUTTIME <dttm>, DIAGNOSIS <chr>,
#> # HOSPITAL_EXPIRE_FLAG <int>, HAS_CHARTEVENTS_DATA <int>
This is a basic example showing the use of a few of the tables specific data access functions.
patients <- db_get_patients(con)
patients
#> # A tibble: 0 × 7
#> # … with 7 variables: SUBJECT_ID <int>, GENDER <chr>, DOB <dttm>, DOD <dttm>,
#> # DOD_HOSP <dttm>, DOD_SSN <dttm>, EXPIRE_FLAG <int>
admissions <- db_get_admissions(con, where = "WHERE SUBJECT_ID = 10006")
admissions
#> # A tibble: 0 × 18
#> # … with 18 variables: SUBJECT_ID <int>, HADM_ID <int>, ADMITTIME <dttm>,
#> # DISCHTIME <dttm>, DEATHTIME <dttm>, ADMISSION_TYPE <chr>,
#> # ADMISSION_LOCATION <chr>, DISCHARGE_LOCATION <chr>, INSURANCE <chr>,
#> # LANGUAGE <chr>, RELIGION <chr>, MARITAL_STATUS <chr>, ETHNICITY <chr>,
#> # EDREGTIME <dttm>, EDOUTTIME <dttm>, DIAGNOSIS <chr>,
#> # HOSPITAL_EXPIRE_FLAG <int>, HAS_CHARTEVENTS_DATA <int>
(NOTE: The example database I use for testing has only empty tables. That is why there are no results returned in the examples calls above.)
This is how you disconnect from the database when finished.
RSQLite::dbDisconnect(con)
Johnson, A., Pollard, T., & Mark, R. (2016). MIMIC-III Clinical Database (version 1.4). PhysioNet. https://doi.org/10.13026/C2XW26.
PhysioNet: Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., … & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.