getPlpData: Get the patient level prediction data from the server
In ted9219/CoDImputationOnlyDeathPop: Cause of death imputation package


This function executes a large set of SQL statements against the database in OMOP CDM format to extract the data needed to perform the analysis.

getPlpData(
  connectionDetails,
  cdmDatabaseSchema,
  oracleTempSchema = cdmDatabaseSchema,
  cohortId,
  outcomeIds,
  studyStartDate = "",
  studyEndDate = "",
  cohortDatabaseSchema = cdmDatabaseSchema,
  cohortTable = "cohort",
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "cohort",
  cdmVersion = "5",
  firstExposureOnly = FALSE,
  washoutPeriod = 0,
  sampleSize = NULL,
  covariateSettings,
  excludeDrugsFromCovariates = FALSE
)

`connectionDetails`	An R object of type `connectionDetails` created using the function `createConnectionDetails` in the `DatabaseConnector` package.
`cdmDatabaseSchema`	The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, for example 'cdm_instance.dbo'.
`oracleTempSchema`	For Oracle only: the name of the database schema where you want all temporary tables to be managed. Requires create/insert permissions to this database.
`cohortId`	A unique identifier to define the at risk cohort. CohortId is used to select the cohort_concept_id in the cohort-like table.
`outcomeIds`	A list of cohort_definition_ids used to define outcomes (-999 means no outcome gets downloaded).
`studyStartDate`	A calendar date specifying the minimum date that a cohort index date can appear. Date format is 'yyyymmdd'.
`studyEndDate`	A calendar date specifying the maximum date that a cohort index date can appear. Date format is 'yyyymmdd'. Important: the study end date is also used to truncate risk windows, meaning no outcomes beyond the study end date will be considered.
`cohortDatabaseSchema`	The name of the database schema that is the location where the cohort data used to define the at risk cohort is available. Requires read permissions to this database.
`cohortTable`	The tablename that contains the at risk cohort. cohortTable has format of COHORT table: cohort_concept_id, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
`outcomeDatabaseSchema`	The name of the database schema that is the location where the data used to define the outcome cohorts is available. Requires read permissions to this database.
`outcomeTable`	The tablename that contains the outcome cohorts. Expectation is outcomeTable has format of COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
`cdmVersion`	Defines the OMOP CDM version used: currently supports "4", "5" and "6".
`firstExposureOnly`	Should only the first exposure per subject be included? Note that this is typically done in the `createStudyPopulation` function, but can already be done here for efficiency reasons.
`washoutPeriod`	The minimum required continuous observation time prior to index date for a person to be included in the at risk cohort. Note that this is typically done in the `createStudyPopulation` function, but can already be done here for efficiency reasons.
`sampleSize`	If not NULL, only this number of people will be sampled from the target population (default NULL).
`covariateSettings`	An object of type `covariateSettings` as created using the `createCovariateSettings` function in the `FeatureExtraction` package.
`excludeDrugsFromCovariates`	A redundant option
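
As an illustration of how these arguments fit together, the sketch below builds the 'yyyymmdd' date strings and wraps a call to `getPlpData` in a function. The helper names `toStudyDate` and `extractPlpData`, the schema names, and the cohort ids are hypothetical placeholders, not values taken from the package. The wrapper is only defined here, since calling it requires a live OMOP CDM database.

```r
# Hypothetical helper: convert a date to the 'yyyymmdd' string format
# expected by studyStartDate and studyEndDate.
toStudyDate <- function(date) format(as.Date(date), "%Y%m%d")

studyStartDate <- toStudyDate("2012-01-01")
studyEndDate <- toStudyDate("2018-12-31")

# Hypothetical wrapper around getPlpData. All schema names and ids below
# are placeholders and must be replaced with values from your own database.
extractPlpData <- function(connectionDetails, covariateSettings) {
  getPlpData(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "cdm_instance.dbo",     # placeholder schema
    cohortDatabaseSchema = "cdm_instance.dbo",  # placeholder schema
    cohortTable = "cohort",
    cohortId = 1,                               # placeholder at risk cohort
    outcomeDatabaseSchema = "cdm_instance.dbo", # placeholder schema
    outcomeTable = "cohort",
    outcomeIds = c(2, 3),                       # placeholder outcome cohorts
    studyStartDate = studyStartDate,
    studyEndDate = studyEndDate,
    cdmVersion = "5",
    firstExposureOnly = TRUE,
    washoutPeriod = 365,
    covariateSettings = covariateSettings
  )
}
```

Arguments left out of the call, such as `sampleSize`, keep the defaults shown in the usage above.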

Based on the arguments, the at risk cohort data is retrieved, as well as outcomes occurring in these subjects. The at risk cohort is identified through user-defined cohorts in a cohort table either inside the CDM instance or in a separate schema. Similarly, outcomes are identified through user-defined cohorts in a cohort table either inside the CDM instance or in a separate schema. Covariates are automatically extracted from the appropriate tables within the CDM. If you wish to exclude concepts from covariates you will need to manually add the concept_ids and descendants to the excludedCovariateConceptIds of the covariateSettings argument.
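
A minimal sketch of excluding concepts from the covariates follows. It assumes the installed `FeatureExtraction` version provides the `excludedCovariateConceptIds` and `addDescendantsToExclude` arguments of `createCovariateSettings`; the wrapper name `makeCovariateSettings`, the chosen covariate flags, and the concept ids are illustrative placeholders only.

```r
# Placeholder concept ids to exclude; replace with the ids relevant to your study.
excludedConceptIds <- c(1118084, 1124300)

# Hypothetical wrapper: builds covariate settings that leave out the given
# concepts. It is only defined here; calling it requires FeatureExtraction.
makeCovariateSettings <- function(conceptIds) {
  FeatureExtraction::createCovariateSettings(
    useDemographicsGender = TRUE,
    useDemographicsAge = TRUE,
    useDrugGroupEraLongTerm = TRUE,
    excludedCovariateConceptIds = conceptIds,
    # Assumption: with this flag set, descendants of the listed concepts are
    # excluded as well, so they need not be enumerated by hand.
    addDescendantsToExclude = TRUE
  )
}
```

The resulting object would then be passed as the `covariateSettings` argument of `getPlpData`.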

Returns an object of type plpData, containing information on the cohorts, their outcomes, and baseline covariates. Information about multiple outcomes can be captured at once for efficiency reasons. This object is a list with the following components:

outcomes: A data frame listing the outcomes per person, including the time to event, and the outcome id. Outcomes are not yet filtered based on risk window, since this is done at a later stage.
cohorts: A data frame listing the persons in each cohort, listing their exposure status as well as the time to the end of the observation period and time to the end of the cohort (usually the end of the exposure era).
covariates: An ffdf object listing the baseline covariates per person in the cohorts. This is done using a sparse representation: covariates with a value of 0 are omitted to save space.
covariateRef: An ffdf object describing the covariates that have been extracted.
metaData: A list of objects with information on how the plpData object was constructed.
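
To make the sparse representation of `covariates` concrete, the base R sketch below expands a small sparse table into a dense matrix. The column names `rowId`, `covariateId` and `covariateValue` and all values are assumptions chosen for illustration, and a plain data frame stands in for the ffdf object.

```r
# Toy sparse covariate table: one row per non-zero (person, covariate) pair.
covariates <- data.frame(
  rowId = c(1, 1, 2, 3),
  covariateId = c(1001, 1002, 1001, 1003),
  covariateValue = c(1, 1, 1, 1)
)

# Person 4 has no non-zero covariates, so no rows appear for them above.
rowIds <- 1:4
covariateIds <- sort(unique(covariates$covariateId))

# Start from an all-zero matrix and fill in only the stored values.
dense <- matrix(
  0,
  nrow = length(rowIds),
  ncol = length(covariateIds),
  dimnames = list(as.character(rowIds), as.character(covariateIds))
)
index <- cbind(
  match(covariates$rowId, rowIds),
  match(covariates$covariateId, covariateIds)
)
dense[index] <- covariates$covariateValue
```

The dense matrix has 12 cells, while the sparse table stores only the 4 non-zero entries; with thousands of covariates per person the saving is far larger.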