extract_data: Extract data by users' requirements.


View source: R/extract_data.R

Description

This function extracts data based on the request forms that users have filled in and saved in the request_output folder of the selected research folder.

Usage

extract_data(
  wkdir = getwd(),
  research.folder = NA,
  inclusion.xls.file = NA,
  variable.xls.file = NA,
  database = NA,
  dataLogic = NA,
  select.output = NA,
  overwrite = TRUE,
  username = NA,
  password = NA
)

Arguments

wkdir

The path to working directory. See initWkdir for details on a working directory.

research.folder

The name of research folder. See initResearchFolder for details on a research folder.

inclusion.xls.file

Name(s) of request form(s) with inclusion criteria. Multiple request forms should be specified as a character vector.

variable.xls.file

Name(s) of request form(s) with variable lists. Multiple request forms should be specified as a character vector. Default is NA, where no variable list is specified, and variables in the inclusion criteria will be extracted instead.

database

Name of the database. For flat tables, use public if the data is stored in public_data, or private if it is stored in research/[research folder]/private_data. When extracting data from a database, use the actual name of the database.
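As a rough illustration of the two flat-table locations described above, the following sketch builds the corresponding folder path. The helper flat_data_dir is hypothetical and not part of RDataXMan, and it assumes that both public_data and research sit directly under the working directory.

```r
# Hypothetical helper (not part of RDataXMan): map the `database` value for
# flat tables to the folder where the data is expected to be stored.
# Assumption: public_data and research/ are direct children of wkdir.
flat_data_dir <- function(wkdir, research.folder, database) {
  switch(database,
    public  = file.path(wkdir, "public_data"),
    private = file.path(wkdir, "research", research.folder, "private_data"),
    # Any other value is treated here as the name of an actual database.
    stop("not a flat-table location: ", database)
  )
}

public_dir  <- flat_data_dir("wk", "requestnum001", "public")
private_dir <- flat_data_dir("wk", "requestnum001", "private")
print(public_dir)
print(private_dir)
```

Any value other than public or private makes this helper stop, which mirrors the idea that such a value names a real database rather than a folder.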

dataLogic

Whether to take the union or the intersection of the inclusion criteria when more than one criterion is specified. Default is NA.
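The difference between the two options can be shown with base R set operations. This is only a conceptual sketch: the identifier values are made up, and it does not show how extract_data merges criteria internally or which values dataLogic accepts.

```r
# Hypothetical identifier sets matched by two separate inclusion criteria.
ids_criterion_a <- c("P001", "P002", "P003")
ids_criterion_b <- c("P002", "P003", "P004")

# Union: subjects that satisfy at least one criterion.
ids_union <- union(ids_criterion_a, ids_criterion_b)

# Intersection: subjects that satisfy every criterion.
ids_intersection <- intersect(ids_criterion_a, ids_criterion_b)

print(ids_union)
print(ids_intersection)
```

The union is never smaller than the intersection, so the choice directly affects how many subjects end up in the extracted data.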

select.output

1 to generate lists of identifier variables from the merged inclusion criteria; 2 to generate an Excel file with summary statistics for both inclusion criteria and variable lists; 3 to generate csv files with the data extracted based on each request form; 4 to generate a single csv file for the final merged data. Multiple selections should be specified as a vector, e.g. c(1, 2, 4).
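To make the four codes easier to read in a script, one option is a small lookup that checks a selection before it is passed on. The names output_labels and describe_outputs are hypothetical and are not provided by the package; the labels simply paraphrase the descriptions above.

```r
# Hypothetical lookup (not part of RDataXMan) for the select.output codes.
output_labels <- c(
  "1" = "identifier lists from merged inclusion criteria",
  "2" = "Excel file with summary statistics",
  "3" = "one csv file per request form",
  "4" = "single csv file with the final merged data"
)

# Validate a selection and return a readable description of each code.
describe_outputs <- function(select.output) {
  stopifnot(all(select.output %in% 1:4))
  unname(output_labels[as.character(select.output)])
}

selected <- describe_outputs(c(1, 2, 4))
print(selected)
```

A check like this catches an out-of-range code before a potentially long extraction is started.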

overwrite

Whether to overwrite existing request form. Default is TRUE.

username

User name for accessing database if data.type is not flat. Default is NA for flat tables.

password

Password for accessing database if data.type is not flat. Default is NA for flat tables.

conn_string

Connection string for accessing ORE server. Default is NA.

Value

Returns a list containing whichever of the following were selected with select.output: the identifier variables, the path to the Excel file with summary statistics, the extracted data, and the merged data. These are also written as csv files in the research/[research folder]/request_output folder. The summary.xls file is returned as well; it includes a count summary sheet and a variable summary sheet.

See Also

genInclusion, genVariable

Examples

## Not run: 
extract_data(wkdir = "Working directory", research.folder = "requestnum001",
             inclusion.xls.file = "inclusion.Diagnosis_DIAGNOSIS_CD(DIAGNOSIS_DESC_ICD_VERSION)",
             variable.xls.file = "variable.Patient(PATIENT_NRIC)",
             select.output = c(1, 2, 4))

## End(Not run)

biostatUniBS/RDataXMan documentation built on Feb. 2, 2021, 9:41 a.m.