    knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
CondiS is an R package that imputes survival times for censored observations. Once the imputed survival times are obtained, standard machine learning techniques for regression modeling can be applied to them directly. This vignette shows how to use the CondiS package and introduces what it can do for you. CondiS was created by Yizhuo Wang, Xuelin Huang, Ziyi Li and Christopher R. Flowers, and is now maintained by Yizhuo Wang.
Install CondiS using the code below to ensure that all the needed packages are installed.
    # install.packages("CondiS", dependencies = TRUE)
    library(CondiS)
CondiS has two functions, CondiS and CondiS_X, that impute survival times for the censored observations so that they resemble the true survival times as closely as possible. The rotterdam dataset from the survival package is used here to demonstrate both functions.
The imputed survival times for censored observations are generated based on their conditional survival distributions derived from the Kaplan-Meier estimator. Below are the input parameters of the CondiS function:
    library(kernlab)
    library(purrr)
    library(tidyverse)
    library(survival)

    # Load the rotterdam data from the survival package
    data(cancer, package = "survival")

    # Recurrence-free survival: the event is a recurrence or a death
    status <- pmax(rotterdam$recur, rotterdam$death)
    rfstime <- with(rotterdam, ifelse(recur == 1, rtime, dtime))

    # Keep columns 2 to 11 as covariates, then append the outcome
    rotterdam <- rotterdam[2:11]
    rotterdam$status <- status
    rotterdam$rfstime <- rfstime
    fit <- survfit(Surv(rfstime, status) ~ 1, data = rotterdam)

    # Obtain the imputed survival time
    pred_time <- CondiS(rfstime, status)
    rotterdam$pred_time <- pred_time

    # After imputation every observation is treated as an event
    rotterdam$status2 <- rep(1, length(status))
    fit_2 <- survfit(Surv(pred_time, status2) ~ 1, data = rotterdam)

    # Visualization
    library(survminer)
    combined <- list(Censored = fit, CondiS = fit_2)
    ggsurvplot(
      combined,
      data = rotterdam,
      combine = TRUE,
      censor = TRUE,
      risk.table = TRUE,
      palette = "jco"
    )
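To make the idea of a conditional survival distribution derived from the Kaplan-Meier estimator more concrete, here is a minimal sketch. It is not the CondiS implementation: the helper `km_conditional_mean` and its arguments `c0` and `tmax` are hypothetical names introduced for illustration, and the sketch assumes that a censored subject is assigned the expected survival time, truncated at `tmax`, given survival past the censoring time `c0`.

```r
library(survival)

# Hypothetical helper (not part of CondiS): expected survival time, truncated
# at tmax, for a subject known to have survived past the censoring time c0.
km_conditional_mean <- function(time, status, c0, tmax = max(time)) {
  fit <- survfit(Surv(time, status) ~ 1)
  # Kaplan-Meier curve as a right-continuous step function with S(t) = 1
  # before the first observed time.
  S <- stepfun(fit$time, c(1, fit$surv))
  s_c0 <- S(c0)
  # Nothing to integrate if the curve has already reached 0 or c0 is at tmax.
  if (s_c0 <= 0 || c0 >= tmax) return(c0)
  # Integrate S(u) over [c0, tmax] piece by piece.
  knots <- c(c0, fit$time[fit$time > c0 & fit$time < tmax], tmax)
  area <- sum(diff(knots) * S(knots[-length(knots)]))
  # E[min(T, tmax) | T > c0] = c0 + integral of S(u) from c0 to tmax, over S(c0)
  c0 + area / s_c0
}

# Toy data: four uncensored subjects with event times 1, 2, 3 and 4.
toy_time <- c(1, 2, 3, 4)
toy_status <- c(1, 1, 1, 1)

# Expected time for a hypothetical subject censored at time 2.
km_conditional_mean(toy_time, toy_status, c0 = 2)
```

In the toy sample, a subject still at risk after time 2 can only fail at time 3 or 4, each with equal Kaplan-Meier weight, so the helper returns about 3.5. The value is never smaller than the censoring time, which is the property any such imputation needs.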
The imputed survival times are further improved by incorporating the covariate information through machine learning modeling (CondiS-X). Below are the input parameters of the CondiS-X function:
    covariates <- rotterdam[, 1:10]

    # Update the imputed survival time
    pred_time_2 <- CondiS_X(pred_time, status, covariates)
    rotterdam$pred_time_2 <- pred_time_2
    # Pre-process the data
    library(caret)
    preproc <- preProcess(rotterdam[, 1:10], method = c("center", "scale"))
    trainPreProc <- predict(preproc, rotterdam[, 1:10])
    train_control <- trainControl(method = "repeatedcv")

    # Train-test split
    set.seed(42)
    smp_size <- floor(0.75 * nrow(rotterdam))
    train_ind <- sample(seq_len(nrow(rotterdam)), size = smp_size)
    train <- rotterdam[train_ind, ]
    test <- rotterdam[-train_ind, ]

    # Fit a radial-kernel SVM to the CondiS-imputed times using the covariates only;
    # the outcome-derived columns, including pred_time_2, are excluded from the predictors
    fit_svm <- train(
      pred_time ~ . - status - status2 - rfstime - pred_time_2,
      data = train,
      method = "svmRadial",
      trControl = train_control,
      na.action = na.omit
    )
    pred_svm <- predict(fit_svm, test)

    # Mean absolute error (MAE)
    calc_MAE <- function(actual, predicted) {
      error <- actual - predicted
      mean(abs(error))
    }

    ## In the testing set:
    # The MAE of the CondiS-imputed survival time and the SVM-predicted survival time is:
    calc_MAE(test$pred_time, pred_svm)
    # The MAE of the CondiS-X-imputed survival time and the SVM-predicted survival time is:
    calc_MAE(test$pred_time_2, pred_svm)