getIPD: Reconstruct individual patient data (IPD) from Scanned...
In IPDfromKM: Map Digitized Survival Curves Back to Individual Patient Data



After the raw dataset has been processed with the `preprocess()` function, `getIPD()` reconstructs the IPD. The total number of events (`tot.events`) is an optional input, and `armID` is an arbitrary label for the patients' treatment group (typically 0 for the control group and 1 for the treatment group).

The output is the reconstructed IPD in the form of a three-column table (i.e., time, patient status, and treatment group ID).

In addition, to evaluate the accuracy of the reconstruction, the function calculates the survival probabilities at each read-in time point from the reconstructed IPD and compares them with the corresponding read-in survival probabilities. The resulting test statistics are also included in the output.
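The accuracy check described above can be sketched in a few lines of R. This is a minimal illustration only: the data frame `ipd`, the vectors `t_read` and `s_read`, and all their values are made up for the example, and the use of `survival::survfit()` is an assumption about how such a comparison could be done, not a description of how `getIPD()` computes it internally.

```r
library(survival)  # recommended package, installed with standard R

# Hypothetical reconstructed IPD (toy values, not output of getIPD())
ipd <- data.frame(
  time   = c(2, 3, 5, 7, 8, 10),
  status = c(1, 0, 1, 1, 0, 1)   # 1 = event, 0 = censored
)

# Hypothetical read-in points from a digitized K-M curve
t_read <- c(1, 4, 6, 9)
s_read <- c(1.00, 0.85, 0.60, 0.40)

# Kaplan-Meier estimate from the IPD, evaluated at the read-in times
fit   <- survfit(Surv(time, status) ~ 1, data = ipd)
s_est <- summary(fit, times = t_read, extend = TRUE)$surv

# Summary measures of the discrepancy between the two curves
err <- s_est - s_read
accuracy <- c(
  rmse           = sqrt(mean(err^2)),
  mean_abs_error = mean(abs(err)),
  max_abs_error  = max(abs(err))
)
print(round(accuracy, 4))
```

Small values of all three measures indicate that the curve implied by the reconstructed data closely tracks the digitized one.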

getIPD(prep, armID = 1, tot.events = NULL)

`prep`	the class object returned from the `preprocess()` function.
`armID`	the arbitrary label used as the group indicator for the reconstructed IPD. Typically 0 for the control group and 1 for the treatment group.
`tot.events`	the total number of events. This may not be available for some published curves, so this input is optional.

getIPD() returns a list object containing the following items.

IPD: the estimated individual patient data in a three-column table (i.e., time, status, and treatment group indicator).

Points: a data frame showing the parameter estimates at each read-in time point.

riskmat: a data frame showing the indices of the read-in points within each time interval, as well as the estimated numbers of censored patients and events within each time interval.

kstest: the test statistic and p-value of the Kolmogorov-Smirnov test comparing the distributions of the estimated and read-in K-M curves. The null hypothesis is that the read-in and estimated survival probabilities are from the same distribution.

precision: a list showing the root mean square error (RMSE), mean absolute error, and maximum absolute error, which measure the differences between the estimated and read-in survival probabilities.

endpts: the number of patients remaining at the end of the trial.
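To make the `kstest` item more concrete, the sketch below runs a two-sample Kolmogorov-Smirnov test with base R's `ks.test()` on two short vectors of survival probabilities. The vectors `s_read` and `s_est` are hypothetical values chosen for illustration, and treating the comparison as a plain two-sample test is an assumption; the package may set up the test differently.

```r
# Hypothetical read-in and estimated survival probabilities
s_read <- c(1.00, 0.85, 0.60, 0.40)
s_est  <- c(0.99, 0.83, 0.63, 0.42)

# Two-sample Kolmogorov-Smirnov test (stats package, loaded by default)
ks <- ks.test(s_est, s_read)

# Test statistic and p-value; a large p-value gives no evidence
# that the two sets of probabilities differ in distribution
ks_stat <- unname(ks$statistic)
ks_p    <- ks$p.value
print(c(statistic = ks_stat, p.value = ks_p))
```

A large p-value here means the test does not reject the null hypothesis, which is the desired outcome for a faithful reconstruction.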

Guyot P, Ades AE, Ouwens MJ, Welton NJ. Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves. BMC Med Res Methodol. 2012; 12:9.

library(IPDfromKM)

# Radiationdata$radio is a dataset exported from the ScanIt software
radio <- Radiationdata$radio

# Time points (in months) at which the numbers of patients at risk are reported
trisk <- Radiationdata$trisk

# Numbers of patients at risk reported at the time points in trisk
nrisk.radio <- Radiationdata$nrisk.radio

# Use trisk and nrisk as input for preprocessing and reconstruction
pre_radio_1 <- preprocess(dat = radio, trisk = trisk,
                          nrisk = nrisk.radio, totalpts = NULL, maxy = 100)
est_radio_1 <- getIPD(prep = pre_radio_1, armID = 0, tot.events = NULL)

# The output includes the reconstructed individual patient data
head(est_radio_1$IPD)

# When trisk and nrisk are not available, the initial number of
# patients (totalpts) must be supplied instead
pre_radio_2 <- preprocess(dat = radio, totalpts = 213, maxy = 100)
est_radio_2 <- getIPD(prep = pre_radio_2, armID = 0, tot.events = NULL)

# The output includes the reconstructed individual patient data
head(est_radio_2$IPD)