find_exam_ram: Find exam data within a given timeframe using parallel CPU...

View source: R/find_exam_ram.R

find_exam_ram R Documentation

Find exam data within a given timeframe using parallel CPU computing without shared RAM management.

Description

Finds all, the earliest, or the closest examinations relative to given timepoints using parallel computing.

Usage

find_exam_ram(
  d_from,
  d_to,
  d_from_ID = "ID_MERGE",
  d_to_ID = "ID_MERGE",
  d_from_time = "time_rad_exam",
  d_to_time = "time_enc_admit",
  time_diff_name = "timediff_exam_to_db",
  before = TRUE,
  after = TRUE,
  time = 1,
  time_unit = "days",
  multiple = "closest",
  add_column = NULL,
  keep_data = FALSE,
  nThread = parallel::detectCores() - 1
)

Arguments

d_from

data table, the database which is searched to find examinations within the timeframe.

d_to

data table, the database whose timepoints serve as the reference for finding examinations within the timeframe.

d_from_ID

string, column name of the patient ID column in d_from. Defaults to ID_MERGE.

d_to_ID

string, column name of the patient ID column in d_to. Defaults to ID_MERGE.

d_from_time

string, column name of the time variable column in d_from. Defaults to time_rad_exam.

d_to_time

string, column name of the time variable column in d_to. Defaults to time_enc_admit.

time_diff_name

string, column name of the new column created which holds the time difference between the exam and the time provided by d_to. Defaults to timediff_exam_to_db.

before

boolean, should times before the given time be considered. Defaults to TRUE.

after

boolean, should times after the given time be considered. Defaults to TRUE.

time

integer, the timeframe considered between the exam and the d_to timepoints. Defaults to 1.
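The interplay of before, after and time can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper in_window is hypothetical, the difference is assumed to be exam time minus reference time (so negative values mean the exam came first), and the window boundary is assumed to be inclusive.

```r
# Hypothetical helper, not part of parseRPDR. Assumes diff is
# "exam time minus reference time" in the chosen time unit and
# that the window boundary is inclusive.
in_window <- function(diff, before = TRUE, after = TRUE, time = 1) {
  keep_before <- before & diff <= 0 & abs(diff) <= time
  keep_after  <- after  & diff >= 0 & abs(diff) <= time
  keep_before | keep_after
}

# Made-up differences in days
diffs <- c(-2, -0.5, 0, 0.75, 3)

both_sides  <- in_window(diffs)                  # FALSE TRUE TRUE TRUE FALSE
only_before <- in_window(diffs, after = FALSE)   # FALSE TRUE TRUE FALSE FALSE
only_after  <- in_window(diffs, before = FALSE)  # FALSE FALSE TRUE TRUE FALSE

print(rbind(both_sides, only_before, only_after))
```

Setting both before and after to FALSE would leave no admissible direction, so under these assumptions nothing would be kept.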

time_unit

string, the unit of time used. It is passed on to the units argument of difftime. "secs", "mins", "hours", "days" and "weeks" are supported. Defaults to days.
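To show how the unit changes the reported difference, the base R snippet below expresses one interval in several units. The timestamps are made up for the example; only difftime and its units argument come from the description above.

```r
# Hypothetical timestamps, used only for illustration
exam_time  <- as.POSIXct("2020-01-03 12:00:00", tz = "UTC")
admit_time <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")

# The same interval (2 days and 12 hours) in three of the supported units
diff_days  <- as.numeric(difftime(exam_time, admit_time, units = "days"))
diff_hours <- as.numeric(difftime(exam_time, admit_time, units = "hours"))
diff_weeks <- as.numeric(difftime(exam_time, admit_time, units = "weeks"))

print(c(days = diff_days, hours = diff_hours, weeks = diff_weeks))
```

Because time is interpreted in the chosen unit, time = 1 with "days" and time = 24 with "hours" describe the same window.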

multiple

string, which exams to give back. closest gives back the exam closest to the time provided by d_to. all gives back all occurrences within the timeframe. earliest gives back the earliest exam within the timeframe. In case of ties for closest or earliest, all tied exams are returned. Defaults to closest.
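The three options, including the tie behaviour, can be illustrated with a small base R sketch. The data and the helper pick are hypothetical and only mimic the described selection rules; they are not taken from the package source.

```r
# Made-up exams for a single patient, all assumed to lie within the timeframe
exams <- data.frame(
  exam = c("CT", "MR", "XR"),
  time = as.POSIXct(c("2020-01-01 12:00", "2020-01-01 20:00",
                      "2020-01-02 04:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
reference <- as.POSIXct("2020-01-01 16:00", tz = "UTC")

# Signed difference in hours (exam minus reference, an assumed convention)
exams$diff <- as.numeric(difftime(exams$time, reference, units = "hours"))

# Hypothetical helper mirroring the described options
pick <- function(d, multiple = c("closest", "all", "earliest")) {
  multiple <- match.arg(multiple)
  switch(multiple,
    all      = d,
    closest  = d[abs(d$diff) == min(abs(d$diff)), ],  # ties are all kept
    earliest = d[d$time == min(d$time), ]             # ties are all kept
  )
}

print(pick(exams, "closest")$exam)   # CT and MR are both 4 hours away
print(pick(exams, "earliest")$exam)  # CT
print(nrow(pick(exams, "all")))      # 3
```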

add_column

string, a column name in d_to to add to the output. Defaults to NULL.

keep_data

boolean, whether to include empty rows with only the d_from_ID column filled out for cases that have data in d_from, but not within the time range. Defaults to FALSE.
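The effect described for keep_data resembles the difference between keeping only matched rows and a left join on the ID column. The base R sketch below uses made-up data and the default column names from Usage; it is an analogue of the described behaviour, not the package's own code.

```r
# Made-up data: patients 1 to 3 have exams in d_from,
# but only patient 1 has an exam inside the time range
d_from <- data.frame(ID_MERGE = c(1, 2, 3))
matched <- data.frame(
  ID_MERGE = 1,
  exam = "CT",
  timediff_exam_to_db = 0.5,
  stringsAsFactors = FALSE
)

# Analogue of keep_data = FALSE: only rows within the time range
res_drop <- matched

# Analogue of keep_data = TRUE: unmatched IDs stay, other columns are NA
res_keep <- merge(unique(d_from["ID_MERGE"]), matched,
                  by = "ID_MERGE", all.x = TRUE)

print(res_keep)
```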

nThread

integer, number of threads used by dopar for parallelization. If it is set to 1, no parallel backend is created and the function is executed sequentially. Defaults to parallel::detectCores() - 1.
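The switch between sequential and parallel execution can be sketched as below. Note the substitution: the description mentions dopar (from the foreach package), whereas this sketch uses the base parallel package so that it runs without extra dependencies. The helper run_chunks and the chunking by ID are assumptions made for illustration.

```r
library(parallel)  # ships with base R

# Hypothetical helper: applies f to each chunk, sequentially when
# n_thread is 1 and on a socket cluster otherwise.
run_chunks <- function(chunks, f, n_thread = 1) {
  if (n_thread <= 1) {
    return(lapply(chunks, f))  # no parallel backend is created
  }
  cl <- makeCluster(n_thread)
  on.exit(stopCluster(cl), add = TRUE)  # always release the workers
  parLapply(cl, chunks, f)
}

# detectCores() may return NA, so fall back to a single thread
cores <- detectCores()
n_thread <- if (is.na(cores)) 1 else max(1, cores - 1)

# Made-up patient IDs split into two chunks; run sequentially here
ids <- 1:10
chunks <- split(ids, rep(1:2, each = 5))
res <- run_chunks(chunks, function(x) sum(x), n_thread = 1)

print(unlist(res))
```

Passing the computed n_thread instead of 1 would start a cluster with that many workers; for small inputs the start-up cost of the cluster usually outweighs the gain.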

Value

data table, with d_from filtered to the rows that fall within the timeframe. The columns of d_from are returned for every row that complies with the time constraints specified in the function call. An additional column, named by time_diff_name, holds the time difference between the time columns of d_from and d_to for that case. The time column of d_to specified by d_to_time is also returned under the name time_to_db. A further column specified in add_column may be added from d_to.


parseRPDR documentation built on March 31, 2023, 11:36 p.m.