dataPreprocess: Preprocess Data to Be Fed into Joint Models

Description

dataPreprocess preprocesses data for use in fitting joint models. It assumes that the longitudinal measurements are recorded in one data frame with one row per measurement and that the survival information is recorded in another data frame with one row per subject. The function merges the two data frames by subject identification and generates three new columns: start, stop, event. See Value.
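As an illustration of the two input layouts described above, the following sketch builds a pair of toy data frames. The column names ID, obstime, Time and death mirror those used in the Examples section; the column y and all values are invented for illustration and are not taken from any data set shipped with the package.

```r
# Toy longitudinal data: one row per measurement (values are invented).
toy.long <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  obstime = c(0, 1, 2.5, 0, 1.5),
  y       = c(5.1, 4.8, 4.2, 6.0, 6.3)   # hypothetical longitudinal outcome
)

# Toy survival data: one row per subject (values are invented).
toy.surv <- data.frame(
  ID    = c(1, 2),
  Time  = c(3.0, 4.0),
  death = c(1, 0)   # 1 = event, 0 = censored (the default coding)
)

# Each subject appears once in toy.surv and one or more times in toy.long.
table(toy.long$ID)
```

Data in this shape could then be passed as the long and surv arguments, with "ID" as id.col.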

Usage

dataPreprocess(long, surv, id.col, long.time.col, surv.time.col, surv.event.col, 
               surv.event.indicator = list(censored = 0, event = 1), suffix = ".join")

Arguments

long

a data frame for the longitudinal data, one row per measurement, with subject identification, time of measurement, and longitudinal measurements, etc.

surv

a data frame for the survival data, one row per subject, with subject identification (column name should match that in long), possibly censored time-to-event, and event indicator (normally 0=censored, 1=event), etc.

id.col

a character string specifying the subject identification column in both long and surv.

long.time.col

a character string specifying the time of measurement column in long.

surv.time.col

a character string specifying the possibly censored time-to-event column in surv.

surv.event.col

a character string specifying the event status column in surv.

surv.event.indicator

a list specifying the values in column surv.event.col corresponding to censored and event status.

suffix

an optional character string specifying the suffix to be added to the start, stop, event columns in case long or surv already has columns with these names.
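When the status column does not use the default 0/1 coding, surv.event.indicator is where the coding is declared. The sketch below is a base R pre-check under assumed toy data (a hypothetical status column coded 1 = censored, 2 = event); it does not call the package. The commented call at the end shows how the arguments from Usage would be supplied, with toy.long standing for a longitudinal data frame that is not defined here.

```r
# Toy survival data (invented values); status is coded 1 = censored, 2 = event.
toy.surv <- data.frame(ID = c(1, 2, 3), Time = c(3, 4, 2.5), status = c(2, 1, 2))

# Coding that would be passed as surv.event.indicator.
indicator <- list(censored = 1, event = 2)

# Sanity check: every observed status value is one of the two declared codes.
stopifnot(all(toy.surv$status %in% unlist(indicator)))

# The same information expressed in the default 0/1 coding.
status01 <- ifelse(toy.surv$status == indicator$event, 1, 0)
status01

## Not run:
## toy.join <- dataPreprocess(toy.long, toy.surv, "ID", "obstime", "Time", "status",
##                            surv.event.indicator = indicator, suffix = ".jm")
```

Checking the codes up front makes a mismatch between the data and the declared coding visible before the merge is attempted.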

Value

A data frame that merges long and surv by subject identification, with one row per longitudinal measurement and three new columns: start, stop, event (the suffix specified by suffix is appended to these column names):

start

starting time of the interval that contains the time of the longitudinal measurement.

stop

ending time of the interval that contains the time of the longitudinal measurement.

event

event indicator showing whether the event of interest, e.g. death, occurs in the interval given by start and stop.
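This page does not spell out how the intervals are constructed. The base R sketch below shows one common counting-process convention, which is an assumption here and not confirmed by the source: start is the measurement time, stop is the subject's next measurement time (or the survival time on the subject's last row), and event is 1 only on the last row of a subject who experienced the event. The toy data and the default ".join" suffix are used for illustration; compare against the actual output of dataPreprocess before relying on this reading.

```r
# Toy inputs (invented values), in the layouts expected for long and surv.
toy.long <- data.frame(ID = c(1, 1, 1, 2, 2),
                       obstime = c(0, 1, 2.5, 0, 1.5),
                       y = c(5.1, 4.8, 4.2, 6.0, 6.3))
toy.surv <- data.frame(ID = c(1, 2), Time = c(3, 4), death = c(1, 0))

# One row per longitudinal measurement, ordered by subject and time.
merged <- merge(toy.long, toy.surv, by = "ID")
merged <- merged[order(merged$ID, merged$obstime), ]

# Next measurement time within each subject; NA marks the subject's last row.
next.time <- ave(merged$obstime, merged$ID, FUN = function(t) c(t[-1], NA))
is.last <- is.na(next.time)

# Assumed convention (not confirmed by the package documentation).
merged$start.join <- merged$obstime
merged$stop.join  <- ifelse(is.last, merged$Time, next.time)
merged$event.join <- ifelse(is.last & merged$death == 1, 1, 0)

merged[, c("ID", "start.join", "stop.join", "event.join")]
```

Under this convention, each measurement time falls at the start of its own interval, and the event can only be attributed to the final interval of a subject.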

Note

1. If long and surv share column names (other than the subject identification column), the corresponding columns from long and surv are named with the suffixes ".long" and ".surv", respectively, in the output data frame.

2. The measurement times of the longitudinal data and the possibly censored time-to-event must be recorded on the same time scale for each subject, i.e. time 0 means the same time point for the longitudinal and survival measurements.
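The column naming described in the first note matches what base R's merge() produces when given suffixes = c(".long", ".surv"). Whether dataPreprocess uses merge() internally is an assumption; the sketch below only illustrates the resulting names, using invented toy data with a hypothetical shared column site.

```r
# Both toy data frames carry a column named "site" (invented values).
toy.long <- data.frame(ID = c(1, 1, 2), obstime = c(0, 1, 0),
                       site = c("A", "A", "B"))
toy.surv <- data.frame(ID = c(1, 2), Time = c(3, 4), death = c(1, 0),
                       site = c("A", "B"))

# Shared non-key columns are disambiguated with the given suffixes.
merged <- merge(toy.long, toy.surv, by = "ID", suffixes = c(".long", ".surv"))
names(merged)
```

The shared column appears twice in the result, as site.long and site.surv, while columns unique to either input keep their original names.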

Author(s)

Cong Xu helenxu1112@gmail.com

Examples

## Not run: 
liver.join <- dataPreprocess(liver.long, liver.surv, 'ID', 'obstime', 'Time', 'death')

## End(Not run)

JSM documentation built on Sept. 4, 2020, 1:08 a.m.
