timedivision: Convert Multiple Rows Arranged Time-Series Data into...

View source: R/timedivision.R

timedivision    R Documentation

Convert Multiple Rows Arranged Time-Series Data into Time-Slices Data

Description

A data preprocessing step essential for building a survival path model. For each subject with observations at different time points, this function selects the observation that belongs to each time slice, as controlled by the parameters period, left_interval and right_interval.

Usage

timedivision(dataset,
ID,
time,
period=30,
left_interval = 0.5,
right_interval = 0.5
)

Arguments

dataset

A time-series dataset arranged in multiple rows per subject, containing identification numbers, follow-up time points, risk factors, survival time, and survival status.

ID

Character string naming the column that holds the ID for each row of the dataset. The ID should be unique for each subject.

time

Character string naming the column, in Date format, that holds the time point of each observation.

period

Numeric; the follow-up sampling period, normally counted in days.

left_interval

Numeric, preferably within the interval (0,1). For time slice T, the earliest observation within the time interval [left_interval*period, right_interval*period] is taken as the sampled data for that time slice.

right_interval

Numeric; see left_interval.

Details

This function facilitates the automatic generation of time-slice data. The observation dates for each subject should be arranged in ascending order. Researchers can skip this step if they intend to prepare time-slice data manually or with customized code. Note that this function only supports sampling the "earliest" observation within the interval of each time slice. If no observation falls into the interval of time slice T, then sampling of observations in time slice T+1 for that subject is terminated.
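The sampling rule described above can be illustrated for a single subject with a minimal base R sketch. This is not the package's implementation: pick_slices is a hypothetical helper, and the placement of the windows (slice T centred on (T-1)*period days after the subject's first observation, with a half-open window so that adjacent windows do not share a boundary) is an assumption made only for illustration.

```r
# Hypothetical helper, NOT part of SurvivalPath.
# ASSUMPTION: slice T is centred on (T - 1) * period days after the subject's
# first observation, with window
# [centre - left_interval * period, centre + right_interval * period).
pick_slices <- function(dates, period = 30,
                        left_interval = 0.5, right_interval = 0.5) {
  stopifnot(length(dates) > 0, period > 0)
  dates <- sort(as.Date(dates))            # ascending order, as required
  days  <- as.numeric(dates - dates[1])    # days since the first observation
  rows  <- integer(0)
  slice <- 1
  repeat {
    centre    <- (slice - 1) * period
    in_window <- which(days >= centre - left_interval * period &
                       days <  centre + right_interval * period)
    if (length(in_window) == 0) break      # empty slice: stop sampling
    rows  <- c(rows, in_window[1])         # keep the earliest observation
    slice <- slice + 1
  }
  data.frame(time_slice = seq_along(rows), date = dates[rows])
}

# Toy dates for one subject, sampled with a 90-day period
obs <- as.Date(c("2020-01-01", "2020-01-10", "2020-04-05",
                 "2020-06-28", "2021-01-15"))
res <- pick_slices(obs, period = 90)
print(res)
```

In this toy run the fourth window contains no observation, so sampling stops after three slices and the 2021-01-15 visit is not used, even though it would fall inside a later window.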

Value

A data.frame containing the observations of the different time slices for each ID. The returned data.frame has a new column, "time_slice", which indicates the time slice to which each observation belongs.
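Because sampling stops at the first empty time slice, later slices usually contain fewer subjects. A quick check of the returned column can be sketched as follows; the data frame toy is made-up data that only mimics the "ID" and "time_slice" columns and is not output from timedivision.

```r
# Made-up data imitating the shape of the returned data.frame (assumption)
toy <- data.frame(ID = c(1, 1, 1, 2, 2),
                  time_slice = c(1, 2, 3, 1, 2))

# Number of observations available in each time slice
per_slice <- table(toy$time_slice)

# Last time slice reached by each subject
last_slice <- tapply(toy$time_slice, toy$ID, max)

print(per_slice)
print(last_slice)
```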

Author(s)

Lujun Shen and Tao Zhang

Examples

library(SurvivalPath)  # provides timedivision(), generatorDTSD() and DTSDHCC
library(dplyr)
data("DTSDHCC")
# Draw a reproducible random sample of 500 subjects
id <- DTSDHCC$ID[!duplicated(DTSDHCC$ID)]
set.seed(123)
id <- sample(id, 500)
miniDTSDHCC <- DTSDHCC[DTSDHCC$ID %in% id, ]
# Convert to time-slice data with a 90-day sampling period
dataset <- timedivision(miniDTSDHCC, "ID", "Date", period = 90,
                        left_interval = 0.5, right_interval = 0.5)
# Pass the time-sliced data to generatorDTSD()
resu <- generatorDTSD(dataset, periodindex = "time_slice", IDindex = "ID",
  timeindex = "OStime_day", statusindex = "Status_of_death",
  variable = c("Age", "Amount.of.Hepatic.Lesions",
               "Largest.Diameter.of.Hepatic.Lesions",
               "New.Lesion", "Vascular.Invasion", "Local.Lymph.Node.Metastasis",
               "Distant.Metastasis", "Child_pugh_score", "AFP"),
  predict.time = 365 * 1)


SurvivalPath documentation built on July 4, 2022, 1:05 a.m.