import_CPRD_data: Imports all selected CPRD data into an sqlite database
In rOpenHealth/rEHR: Manipulating and Analysing Electronic Health Record Data

import_CPRD_data

R Documentation

Imports all selected CPRD data into an sqlite database

Description

This function can import both cohorts downloaded via the CPRD online tool and CPRD GOLD builds.

Usage

import_CPRD_data(db, data_dir, filetypes = c("Additional", "Clinical",
  "Consultation", "Immunisation", "Patient", "Practice", "Referral", "Staff",
  "Test", "Therapy"), dateformat = "%d/%m/%Y", yob_origin = 1800,
  regex = "PET", recursive = TRUE, ...)

Arguments

`db`	a database connection
`data_dir`	the directory containing the CPRD cohort data
`filetypes`	character vector of filetypes to be imported
`dateformat`	the format in which dates are stored in the CPRD data. If this is wrong, the import will not fail, but all dates are likely to be NA
`yob_origin`	value to add to the stored yob values to get the actual year of birth (generally 1800)
`regex`	character regular expression identifying the data files in the directory. The matched part of the file name is separated from the filetype by an underscore, e.g. 'p[0-9]{3}' in CPRD GOLD
`recursive`	logical; should files be searched for recursively under data_dir?
`...`	arguments to be passed to add_to_database
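
The effect of `dateformat`, `yob_origin` and `regex` can be illustrated in base R without any CPRD data. The sketch below is not the package's implementation: the date strings, yob values and file names are made up, and the way the regex and filetype are pasted together is an assumption based only on the argument description above.

```r
# Base R only; neither rEHR nor CPRD data is needed for this sketch.

# dateformat: a matching format parses, a mismatched one silently gives NA.
raw_dates <- c("25/12/2010", "01/02/2011")        # made-up example values
good <- as.Date(raw_dates, format = "%d/%m/%Y")   # matches the stored format
bad  <- as.Date(raw_dates, format = "%Y-%m-%d")   # wrong format: NA, no error

# yob_origin: stored yob values are treated as offsets from the origin year.
yob_stored <- c(185, 160)                         # made-up example values
yob_actual <- yob_stored + 1800

# regex: hypothetical file names, and an ASSUMED "<regex>_<filetype>" pattern.
files <- c("PET_Clinical001.zip", "PET_Patient001.zip", "notes.txt")
clinical_files <- grep(paste0("PET", "_", "Clinical"), files, value = TRUE)
```

Checking `anyNA(good)` on a small sample of the raw dates before a long import is a cheap way to catch a wrong `dateformat` early.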

Details

Note that if you choose to import all the filetypes, you may end up with a very large database file. You may therefore prefer to import only the files you want to use; you can always import the rest later. This function may take a long time to run because it unzips (potentially large) files and reads them into R, where it converts the date formats, before importing them to SQLite. However, this initial data preparation step will greatly accelerate downstream processing.
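
A staged import along these lines might look like the following sketch. The directory path and the choice of filetypes are hypothetical, and the connection is assumed to come from the package's `database()` helper; the import only runs if rEHR is installed and the directory exists.

```r
# Hypothetical location of the unzipped or zipped CPRD cohort files.
data_dir <- "path/to/cprd_cohort"

# All filetypes listed in Usage, and the subset wanted for a first pass.
all_types <- c("Additional", "Clinical", "Consultation", "Immunisation",
               "Patient", "Practice", "Referral", "Staff", "Test", "Therapy")
first_pass <- c("Patient", "Clinical", "Therapy")
remaining  <- setdiff(all_types, first_pass)   # can be imported later

if (requireNamespace("rEHR", quietly = TRUE) && dir.exists(data_dir)) {
  # Assumption: rEHR::database() opens or creates the SQLite database.
  db <- rEHR::database(file.path(tempdir(), "cprd.sqlite"))
  rEHR::import_CPRD_data(db, data_dir = data_dir, filetypes = first_pass,
                         dateformat = "%d/%m/%Y", yob_origin = 1800,
                         regex = "PET", recursive = TRUE)
  # Later, the same call with filetypes = remaining adds the other tables.
}
```

Importing in stages keeps the first database small and defers the slow unzip and date conversion for filetypes that may never be needed.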

rOpenHealth/rEHR documentation built on Sept. 25, 2024, 5:32 p.m.