Tools for handling data in the GPRD, HES, ONS and MINAP linked dataset (CALIBER)
This package contains four sets of tools:
Functions to import single or multiple files into data.table or ffdf objects in R, with automatic unzipping of compressed files, conversion of dates and application of lookups.
A 'cohort' S3 class to store information about a cohort, and functions for generating analysis variables from multiple-row-per-patient data.
Functions for producing summary tables in LaTeX or plain text, with helpers to format numbers and percentages.
Functions for producing forest plots using a spreadsheet template, including the facility to place several plots side by side and to specify the formatting of text.
This package uses the data.table package extensively. Data tables can be modified by reference and are fast and efficient at handling large datasets. There are also functions to use ffdf data frames, which allow huge datasets to be stored in a temporary folder on the hard disk but appear as R objects in the workspace.
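As a brief illustration of what "modified by reference" means, the following sketch uses only the data.table package itself. The table and column names (dt, weight_kg, weight_g) are made up for this example and are not part of CALIBER.

```r
library(data.table)

# Hypothetical example data, not taken from CALIBER
dt <- data.table(anonpatid = 1:3, weight_kg = c(70, 82.5, 64))
alias <- dt   # a second name for the same table; no copy is made

# := updates the table in place rather than returning a modified copy
dt[anonpatid == 2, weight_kg := 83]
dt[, weight_g := weight_kg * 1000]

# copy() is needed when an independent table is wanted
independent <- copy(dt)
independent[, weight_g := NULL]

print(alias)        # reflects the in-place update made through dt
print(independent)  # dt itself still has the weight_g column
```

Because no copy is made on each update, this style keeps memory use low on large tables, at the cost of needing an explicit copy() whenever the original must be preserved.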
The package includes tools for date conversion in CALIBER files and tools for selecting values of a repeat measure or a diagnosis for patients within a particular time window.
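The idea of selecting a value within a time window can be sketched with plain data.table code. This is not the package's own implementation; the table names (cohort_dt, measures), the 365-day look-back window and the rule of taking the most recent value are all assumptions chosen for illustration.

```r
library(data.table)

# Hypothetical cohort and repeat-measure tables
cohort_dt <- data.table(anonpatid = 1:3,
    indexdate = as.IDate(c('2010-01-01', '2009-03-05', '2008-05-06')))
measures <- data.table(anonpatid = c(2L, 2L, 2L, 3L, 3L),
    eventdate = as.IDate(c('2006-01-01', '2008-06-01', '2008-12-01',
        '2005-01-01', '2008-04-01')),
    value = c(1, 1.5, 2, 3, 4))

# Attach each patient's index date to their measurements
m <- merge(measures, cohort_dt, by = 'anonpatid')

# Keep measurements in the 365 days up to and including the index date
m <- m[eventdate <= indexdate & eventdate >= indexdate - 365L]

# For each patient, take the most recent measurement in the window
setorder(m, anonpatid, -eventdate)
latest <- m[, .SD[1], by = anonpatid]

# Left join back so that patients without a value in the window get NA
result <- merge(cohort_dt, latest[, .(anonpatid, value)],
    by = 'anonpatid', all.x = TRUE)
print(result)
```

Other selection rules (earliest value, closest to the index date, a forward-looking window) would follow the same pattern with a different filter or sort order.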
The CALIBERlookups package, if installed, can provide lookup tables for the function extractEntity. The CALIBERcodelists package is useful for creating codelists, but is not required for this package to work.
Denaxas et al. Data Resource Profile: Cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER). Int. J. Epidemiol. (2012) 41 (6): 1625-1638. doi: 10.1093/ije/dys188 http://ije.oxfordjournals.org/content/41/6/1625
# A sample patient cohort file
mycohort <- cohort(data.table(anonpatid = 1:3,
    indexdate = c('2010-01-01', '2009-03-05', '2008-05-06'),
    deathdate = c(NA, '', '2009-09-08'),
    ethnic_hes = c('Black', 'White', 'Indian')))
convertDates(mycohort)
print(mycohort)

# A sample data file with repeat measures for some patients
mydata <- data.table(anonpatid = c(2, 2, 3),
    eventdate = as.IDate(c('2006-01-01', '2008-01-01', '2005-01-01')),
    data1 = c(1, 2, 3))

# Copy the index dates and ethnicity to the repeated measures file.
transferVariables(mycohort, mydata, c('indexdate', 'ethnic_hes'))
print(mydata)

# Now use them to do a calculation on the repeated measures.
mydata[, temp := ifelse(ethnic_hes == 'White', data1, 2)]

# Select a summary measure using addToCohort
addToCohort(mycohort, 'newvar', data = mydata, old_varname = 'temp',
    value_choice = c(2, 1))
print(mycohort)