extract_ho | R Documentation |
Query an RSQLite database and return a data frame with a 0/1 vector indicating whether each individual has at least one observation with a relevant code within a specified time period.
extract_ho(
  cohort,
  varname = NULL,
  codelist = NULL,
  codelist_vector = NULL,
  indexdt,
  t = NULL,
  t_varname = TRUE,
  time_prev = Inf,
  time_post = 0,
  numobs = 1,
  db_open = NULL,
  db = NULL,
  db_filepath = NULL,
  tab = c("observation", "drugissue", "hes_primary", "death"),
  out_save_disk = FALSE,
  out_subdir = NULL,
  out_filepath = NULL,
  return_output = TRUE
)
cohort |
Cohort of individuals to extract the 'history of' variable for. |
varname |
Name of variable in the outputted data frame. |
codelist |
Name of codelist (stored on hard disk) to query the database with. |
codelist_vector |
Vector of codes to query the database with. This takes precedence over codelist if both are specified. |
indexdt |
Name of variable in |
t |
Number of days after |
t_varname |
Whether to alter the variable name in the outputted data frame to reflect |
time_prev |
Number of days prior to index date to look for codes. |
time_post |
Number of days after index date to look for codes. |
numobs |
Number of observations required to return a value of 1. |
db_open |
An open SQLite database connection created using RSQLite::dbConnect, to be queried. |
db |
Name of SQLite database on hard disk (stored in "data/sql/"), to be queried. |
db_filepath |
Full filepath to SQLite database on hard disk, to be queried. |
tab |
Table name to query in SQLite database. |
out_save_disk |
If |
out_subdir |
Sub-directory of "data/extraction/" to save outputted data frame into. |
out_filepath |
Full filepath and filename to save outputted data frame into. |
return_output |
If |
Specifying db requires a specific underlying directory structure. The SQLite database must be stored in "data/sql/" relative to the working directory.
If the SQLite database is accessed through db, the connection will be opened and then closed after the query is complete. The same is true if the database is accessed through db_filepath.
A connection to the SQLite database can also be opened manually using RSQLite::dbConnect and then passed as input to the parameter db_open. Afterwards, the connection must be closed manually using RSQLite::dbDisconnect.
If db_open is specified, it takes precedence over db or db_filepath.
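The manual connection workflow described above can be sketched as follows. This is a minimal illustration, not the package's own code: the database path and the toy "observation" table are hypothetical stand-ins, and the call to extract_ho is left as a comment because it needs a real cohort and extract.

```r
# Hypothetical database file in a temporary directory (illustration only)
db_path <- file.path(tempdir(), "example_manual.sqlite")

# Open the connection manually
con <- RSQLite::dbConnect(RSQLite::SQLite(), db_path)

# Toy table standing in for a real extract
DBI::dbWriteTable(
  con, "observation",
  data.frame(patid = c("1", "2"), medcodeid = c("187341000000114", "0")),
  overwrite = TRUE
)

# The open connection would then be supplied through db_open, for example:
# extract_ho(cohort, codelist_vector = "187341000000114",
#            indexdt = "indexdt", db_open = con, tab = "observation")

# Confirm the table is visible through the open connection
tables <- DBI::dbListTables(con)

# The connection is not closed automatically, so close it manually
RSQLite::dbDisconnect(con)
unlink(db_path)
```

Opening the connection once and reusing it through db_open avoids reopening the database when several variables are extracted in a row.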
If out_save_disk = TRUE, the data frame will automatically be written to an .rds file in a subdirectory "data/extraction/" of the working directory.
This directory structure must be created in advance. out_subdir can be used to specify subdirectories within "data/extraction/". These options use a default naming convention, which can be overridden using out_filepath to manually specify the location on the hard disk to save to.
Alternatively, return the data frame into the R workspace using return_output = TRUE and then save it to the hard disk manually.
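The manual route can be sketched with base R alone. The data frame below is a toy stand-in for the output of extract_ho (the column name "ho" and the file name are assumptions made for this illustration).

```r
# Toy stand-in for a data frame returned with return_output = TRUE
ho_df <- data.frame(patid = c("1", "2", "3"), ho = c(1L, 0L, 0L))

# Hypothetical output location chosen for this illustration
out_file <- file.path(tempdir(), "ho_example.rds")

# Save to disk manually as an .rds file, then read it back
saveRDS(ho_df, out_file)
reloaded <- readRDS(out_file)
```

Saving manually gives full control over the file name and location, at the cost of not following the default naming convention.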
Codelists can be specified in two ways. The first is to read the codelist into R as a character vector and then specify it through the argument codelist_vector.
Codelists stored on the hard disk can also be referred to through the codelist argument, but this requires a specific underlying directory structure.
The codelist on the hard disk must be stored in a directory called "codelists/analysis/" relative to the working directory. The codelist must be a .csv file and contain a column "medcodeid", "prodcodeid" or "ICD10", depending on the input for argument tab.
The input to argument codelist should be a character string giving the name of the file (excluding the suffix '.csv'). The codelist_vector option takes precedence over the codelist argument if both are specified.
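The first approach, reading a codelist into R as a character vector, might look like the sketch below. The file location, the "term" column and the second code are placeholders invented for this illustration; only the "medcodeid" column name comes from the description above.

```r
# Hypothetical codelist file written to a temporary directory (illustration only)
codelist_path <- file.path(tempdir(), "example_codelist.csv")
write.csv(
  data.frame(
    medcodeid = c("187341000000114", "000000000000000"),  # second code is a placeholder
    term = c("example term", "placeholder term")
  ),
  codelist_path,
  row.names = FALSE
)

# Read every column as character so long identifiers are not converted to numbers
codelist_df <- read.csv(codelist_path, colClasses = "character")

# Character vector suitable for passing as codelist_vector
codes <- codelist_df$medcodeid
```

Reading the codes as character keeps leading zeros and avoids long identifiers being shown in scientific notation.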
A data frame with a 0/1 vector and patid. 1 = presence of code within the specified time period.
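To make the 0/1 output concrete, the following base R sketch reproduces the idea on toy data. It is a conceptual illustration only, not the package's implementation: the data are invented, the output column name "ho" is hypothetical, and treating the window boundaries as inclusive is an assumption.

```r
# Toy cohort with one index date per patient (invented data)
cohort <- data.frame(
  patid = c("1", "2", "3"),
  indexdt = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01"))
)

# Toy observations, already restricted to the codes of interest
obs <- data.frame(
  patid = c("1", "1", "2"),
  obsdate = as.Date(c("2019-06-01", "2019-12-15", "2020-03-01"))
)

time_prev <- 365  # days before the index date to search
time_post <- 0    # days after the index date to search
numobs <- 1       # observations required to return a value of 1

# Attach each patient's index date to their observations
merged <- merge(obs, cohort, by = "patid")

# Keep observations inside the window (inclusive boundaries assumed)
in_window <- merged$obsdate >= merged$indexdt - time_prev &
  merged$obsdate <= merged$indexdt + time_post
merged <- merged[in_window, ]

# Count qualifying observations per patient, keeping patients with none
counts <- table(factor(merged$patid, levels = cohort$patid))

# 1 if the patient has at least numobs qualifying observations, otherwise 0
ho <- data.frame(patid = cohort$patid, ho = as.integer(counts >= numobs))
```

Here patient 1 has two observations in the year before the index date, patient 2 has one observation only after the index date, and patient 3 has none, so only patient 1 receives a 1.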
## Load the package that provides the functions used below
library(rcprd)

## Connect
aurum_extract <- connect_database(file.path(tempdir(), "temp.sqlite"))

## Create SQLite database using cprd_extract
cprd_extract(aurum_extract,
  filepath = system.file("aurum_data", package = "rcprd"),
  filetype = "observation", use_set = FALSE)

## Define cohort and add index date
pat <- extract_cohort(system.file("aurum_data", package = "rcprd"))
pat$indexdt <- as.Date("01/01/1955", format = "%d/%m/%Y")

## Extract a history of type variable prior to index date
## (indexdt refers to the index date column created above)
extract_ho(pat,
  codelist_vector = "187341000000114",
  indexdt = "indexdt",
  db_open = aurum_extract,
  tab = "observation",
  return_output = TRUE)

## Clean up
RSQLite::dbDisconnect(aurum_extract)
unlink(file.path(tempdir(), "temp.sqlite"))