load_base: Helper function for loading RPDR data into R.

View source: R/load_base.R

load_base R Documentation


Description

Helper function to load the different data sources from RPDR. It is not intended to be used on its own.

Usage

load_base(
  file,
  merge_id = "EMPI",
  sep = ":",
  id_length = "standard",
  perc = 0.6,
  na = TRUE,
  identical = TRUE,
  nThread = parallel::detectCores() - 1,
  mrn_type = FALSE,
  src = "mrn",
  fill = FALSE,
  sep_load = "|"
)

Arguments

file

string, full file path to the given RPDR txt file.

merge_id

string, name of the column used to create the ID_MERGE column, which is used to merge different datasets. Defaults to EMPI, as it is the preferred MRN in the RPDR system. For the mrn dataset, leave it at EMPI, as it is automatically converted to "Enterprise_Master_Patient_Index".

sep

string, divider between hospital ID and MRN. Defaults to :.

id_length

string, indicating whether to modify MRN lengths to the required values (id_length = standard) or to keep the lengths as they are (id_length = asis). If id_length = standard, then for MGH, BWH, MCL, EMPI and PMRN the lengths of the MRNs are corrected by adding zeros or removing numerals from the beginning. In other cases the lengths are unchanged. Defaults to standard.

perc

numeric, a number between 0 and 1 indicating which parsed ID columns to keep. Columns with data present in at least perc x 100% of patients are kept. Defaults to 0.6.

na

boolean, whether to remove columns with only NA values. Defaults to TRUE.

identical

boolean, whether to remove columns with identical values. Defaults to TRUE.

nThread

integer, number of threads to use to load data. Defaults to parallel::detectCores() - 1.

mrn_type

boolean, whether data in MRN_Type and MRN should be parsed. Defaults to FALSE, as parsing these for all data sources is not advised because it takes considerable time.

src

string, the three-letter source ID of the file, such as dem. Defaults to mrn.
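
The id_length, perc and na arguments describe simple transformations that may be easier to follow in code. The following is a minimal sketch in base R, not the parseRPDR implementation. The helper standardize_id_length(), the width of 7, the toy column names and the use of ">=" for the perc threshold are all assumptions made for illustration; the required lengths for each hospital are not stated here.

```r
# Hypothetical helper, not part of parseRPDR: pad or trim IDs to a fixed width.
# Assumes zeros are added at the beginning of IDs that are too short.
standardize_id_length <- function(id, width) {
  id <- as.character(id)
  n <- nchar(id)
  # Too short: add leading zeros (which() skips missing IDs).
  short <- which(n < width)
  id[short] <- paste0(strrep("0", width - n[short]), id[short])
  # Too long: drop characters from the beginning, keeping the last `width`.
  long <- which(n > width)
  id[long] <- substr(id[long], n[long] - width + 1, n[long])
  id
}

# The width of 7 is an arbitrary example value.
std_ids <- standardize_id_length(c("123", "1234567", "991234567"), width = 7)

# Toy data frame standing in for parsed ID columns (all values are made up).
ids <- data.frame(
  ID_a = c("1", "2", "3", "4", "5"),  # present for 100% of rows
  ID_b = c("1", NA, "3", "4", "5"),   # present for 80% of rows
  ID_c = c(NA, NA, "3", NA, NA),      # present for 20% of rows
  ID_d = c(NA, NA, NA, NA, NA),       # only NA values
  stringsAsFactors = FALSE
)

perc <- 0.6

# Share of non-missing values in each column.
present <- colMeans(!is.na(ids))

# perc: keep columns present in at least perc x 100% of rows (">=" is assumed).
kept_perc <- names(ids)[present >= perc]

# na = TRUE: drop columns that contain only NA values.
kept_na <- names(ids)[colSums(!is.na(ids)) > 0]
```

In this sketch, std_ids holds the padded and trimmed IDs, kept_perc lists the columns that pass the perc threshold, and kept_na lists the columns that are not entirely NA.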

Value

data table, with minimally parsed data and the raw data.

ID_MERGE

numeric, IDs defined by merge_id, used for merging later.

ID_src_EMPI

string, EMPI IDs from src datasource, if the datasource is not mrn. Data is formatted using pretty_mrn().

ID_src_PMRN

string, PMRN IDs from src datasource, if the datasource is not mrn. Data is formatted using pretty_mrn().

ID_src_loc

string, if mrn_type == TRUE, the data in MRN_Type and MRN from datasource src are parsed into IDs corresponding to locations (loc). Data is formatted using pretty_mrn().
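
Since ID_MERGE exists to merge data sources later, a small illustration of such a merge may help. This is only a sketch using base R merge() on two made-up data frames; the table names dem and lab and their columns are hypothetical, and the way parseRPDR users combine real tables may differ.

```r
# Toy tables standing in for two loaded data sources (all values are made up).
dem <- data.frame(ID_MERGE = c(1, 2, 3), age = c(54, 61, 47))
lab <- data.frame(ID_MERGE = c(2, 3, 4), test = c("HbA1c", "LDL", "TSH"),
                  stringsAsFactors = FALSE)

# Inner join: keep only patients present in both tables.
both <- merge(dem, lab, by = "ID_MERGE")

# Left join: keep every row of dem; patients without lab data get NA.
all_dem <- merge(dem, lab, by = "ID_MERGE", all.x = TRUE)
```

The inner join keeps only IDs found in both tables, while all.x = TRUE keeps every patient from the first table.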


parseRPDR documentation built on June 24, 2024, 5:16 p.m.