knitr::opts_chunk$set( collapse = TRUE, comment = "#>", warning = FALSE, message = FALSE, fig.width = 6, fig.height = 5 )
In biomedical studies, the Cox proportional hazard model (Cox 1972) is the most commonly used model for the regression analysis of survival outcomes. Let's consider the standard time-to-event(e.g death) data $(t_i,\mathbf{x_i},\delta_i) $ where $t_i$ is the observed time,$\mathbf{x_i}$ is the covariate vector and $\delta_i$ is the censoring indicator. Censoring is a type of missing data for event time which means the observed time may not be the same as the actual event time. There are different reasons for censoring occurence such as the lost of follow-up for the subject ,the termination of study before the event happens and so on. The Cox model gives the expression of hazard function of a subject at time t in terms of a baseline hazard $\lambda_0 (t)$ and an exponential function of $\mathbf{x_i}$:
$$ \lambda (t \mid x) = \lambda_{0}(t) \exp{H(\mathbf{x_i})} $$ where H(.) is a risk score function that relates covariates and regression coefficients. In the standard Cox model, the risk score is a linear term, i.e., $ H(\mathbf{x_i})=\mathbf{x_i}^{T}\beta$ . In particular, Xsurv_sim_data can generate data for linear model.
library(Xsurv) set.seed(981) beta=c(0.1,0.5,0.2,1,0.01) #set beta value sim_dat<-Xsurv_sim_data(size=100,dim=5,lambda=2,vu=1, beta=beta, c_rate=0.25)
X<-sim_dat[,1:5] y<-sim_dat[,6:7] xs_cv<-Xsurv.cv(X,y,option = 'lgb',search = 'rd',top_n = 1) #random search tunning parameter with lightgbm model
# both xgboost and lightgbm can be automatically tunning with Xsurv.cv # xs_cv<-Xsurv.cv(X,y,option = 'xgb',search = 'grid') xmod<-xs_cv$model cindx<--xs_cv$cindex xs_cv$SHAP #The only selected feature is V4 (if you remember the simulation setting beta=c(0.1,0.5,0.2,1,0.01), V4 is the only one not close to 0.)
library(survival) data(lung) #Using lung data as an example mydata<-(lung[,-1]) mydata[,2]<-mydata[,2]-1 length(mydata[,1]) names(mydata)<-colnames(mydata) datay_train<-mydata[1:180,c(1,2)] datax_train<-mydata[1:180,-c(1,2)] datay_test<-mydata[181:228,c(1,2)] datax_test<-mydata[181:228,-c(1,2)] #A survival tree plot will be generated. cp here is the pre-specified complexity of tree. xs<-Xsurv.cv(datax_train,datay_train,top_n=5,cp=0.01) xm<-xs$model shap=xs$SHAP shap
library(survminer) risk=xs$risk #plot the kaplan-meier curve for different risk levels risk_data=risk$data fit=survival::survfit(Surv(time,status)~risk,data=risk_data) ggsurvplot(fit,pval = TRUE,palette = c('coral','burlywood1','cadetblue1'),size=2, legend=c(0.85,0.85),legend.title='',font.x=c(15,"plain","black"), font.y=c(18,"plain","black"))
```r pre_x<-Xsurv_predict_sv(xm,datax_train,datay_train,datax_test[10,])
plot(pre_x)
datay_test[10,]
```r
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.