sl.time: Super Learner for Censored Outcomes
In RISCA: Causal Inference and Prediction in Cohort-Based Analyses

sl.time

R Documentation

Super Learner for Censored Outcomes

Description

This function fits a Super Learner (SL) to predict survival outcomes.

Usage

sl.time(methods, metric, data, times, failures, group, cov.quanti, cov.quali, cv, 
param.tune, pro.time, optim.local.min, ROC.precision, param.weights.fix,
 param.weights.init, keep.predictions, verbose)

Arguments

`methods`	A vector of characters with the names of the algorithms included in the SL. At least two algorithms have to be included.
`metric`	The loss function used to estimate the weights of the algorithms in the SL. See details.
`data`	A data frame in which to look for the variables related to the follow-up time (`times`), the event indicator (`failures`), the optional treatment/exposure (`group`) and the covariates (`cov.quanti` and `cov.quali`).
`times`	The name of the variable related to the numeric vector with the follow-up times.
`failures`	The name of the variable related to the numeric vector with the event indicators (0=right censored, 1=event).
`group`	The name of the variable related to the exposure/treatment. This variable shall have only two modalities encoded 0 for the untreated/unexposed patients and 1 for the treated/exposed ones. The default value is NULL: no specific exposure/treatment is considered. When a specific exposure/treatment is considered, it will be forced in the algorithm or related interactions will be tested when possible.
`cov.quanti`	The name(s) of the variable(s) related to the possible quantitative covariates. These variables must be numeric.
`cov.quali`	The name(s) of the variable(s) related to the possible qualitative covariates. These variables must be numeric with two levels: 0 and 1. A complete disjunctive form must be used for covariates with more levels.
`cv`	The number of splits for cross-validation. The default value is 10.
`param.tune`	A list with a length equal to the number of algorithms included in `methods`. If `NULL`, the tuning parameters are estimated (see details).
`pro.time`	An optional prognostic time: the maximum delay for which the prognostic capacity is evaluated. It must be expressed in the same unit as the argument `times`. Not used for the following metrics: "loglik", "ibs", "bll", and "ibll". The default value is the time at which half of the subjects are still at risk.
`optim.local.min`	An optional logical value. If `TRUE`, the optimization is performed twice to better ensure the estimation of the weights. If `FALSE` (default value), the optimization is performed once.
`ROC.precision`	The percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve. Only used when `metric="auc"`. 0 (min) and 1 (max) are not allowed. By default, the precision is `seq(.01,.99,.01)`.
`param.weights.fix`	A vector with the parameters of the multinomial logistic regression which generates the weights of the algorithms declared in `methods`. When supplied, these parameters are not estimated. The default value is NULL: the parameters are estimated by a `cv`-fold cross-validation. See details.
`param.weights.init`	A vector with the initial values of the parameters of the multinomial logistic regression which generates the weights of the algorithms declared in `methods`. The default value is NULL: the initial values are set to 0. See details.
`keep.predictions`	A logical value specifying if all the predictions for all the `methods` are saved. If `FALSE`, only the predictions related to the SL are saved (for space saving). The default is `TRUE`.
`verbose`	A logical value specifying whether the progress of the fitting process is printed to the console (`TRUE`). The default is `TRUE`.
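
The requirement on `cov.quali` (0/1 variables, with a complete disjunctive form for covariates with more levels) can be met with base R before calling the function. The following is a minimal sketch on toy data: the variable `centre` and the column names built from it are hypothetical, and whether all indicators or all but one reference level should then be passed to `cov.quali` is not stated in this page.

```r
# Toy data: 'centre' is a hypothetical three-level qualitative covariate
d <- data.frame(age = c(52, 61, 47, 58),
                centre = factor(c("A", "B", "C", "B")))

# model.matrix() without intercept returns one 0/1 column per level
dummies <- model.matrix(~ centre - 1, data = d)
colnames(dummies) <- sub("^centre", "centre.", colnames(dummies))

# Numeric 0/1 columns that can be bound to the other predictors
d2 <- cbind(d["age"], as.data.frame(dummies))
cov.quali <- colnames(dummies)
```

Each row of `dummies` contains exactly one 1, so the three columns together carry the same information as the original factor.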

Details

Each object of the list declared in `param.tune` must have the same name as the corresponding method included in the SL. If `param.tune = NULL`, the tuning parameters of each algorithm are estimated by `cv`-fold cross-validation. Otherwise, the user can propose a tuning grid for each method (see the following table).

The following metrics can be used:

"brier"	Brier score at the prognostic time `pro.time`
"loglik"	Log-likelihood
"ibs"	Integrated Brier score up to the last observed time of event
"ibll"	Integrated binomial log-likelihood up to the last observed time of event
"bll"	Binomial log-likelihood
"ribs"	Restricted integrated Brier score up to the prognostic time `pro.time`
"ribll"	Restricted integrated binomial log-likelihood up to the prognostic time `pro.time`
"auc"	Area under the time-dependent ROC curve up to the prognostic time `pro.time`

Methods:

Names	Description	Package	Assumption
`"aft.gamma"`	Gamma	flexsurv	AFT
`"aft.ggamma"`	Generalized Gamma	flexsurv	AFT
`"aft.weibull"`	Weibull	flexsurv	AFT
`"ph.exponential"`	Exponential	flexsurv	PH
`"ph.gompertz"`	Gompertz	flexsurv	PH
`"cox.en"`	Elastic Net Cox	glmnet	PH
`"cox.lasso"`	Lasso Cox	glmnet	PH
`"cox.ridge"`	Ridge Cox	glmnet	PH
`"rf.time"`	Survival Random Forest	randomForestSRC	RF
`"nn.time"`	Neural Network	survivalmodels	PH

Loss Function metric:

Brier Score ("bs")
Binomial Log-Likelihood ("bll")
Integrated Brier Score ("ibs")
Integrated Binomial Log-Likelihood ("ibll")
Restricted Integrated Brier Score ("ribs")
Restricted Integrated Binomial Log-Likelihood ("ribll")
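
To make the Brier score at a prognostic time concrete, here is a deliberately simplified sketch in base R. It is not the package's computation: the helper `brier_simple` is hypothetical, and it simply drops subjects censored before `pro.time` instead of adjusting for censoring, which a proper estimator would do.

```r
# Simplified Brier score at a prognostic time (no censoring adjustment)
brier_simple <- function(times, failures, surv.pred, pro.time) {
  # Keep subjects whose status at pro.time is known:
  # an event at or before pro.time, or follow-up beyond pro.time
  known <- (times <= pro.time & failures == 1) | (times > pro.time)
  alive <- as.numeric(times[known] > pro.time)
  # Mean squared difference between observed status and predicted survival
  mean((alive - surv.pred[known])^2)
}

times     <- c(2, 5, 7, 10, 12)
failures  <- c(1, 0, 1, 0, 1)
surv.pred <- c(0.2, 0.6, 0.5, 0.8, 0.9)  # predicted survival at pro.time
bs <- brier_simple(times, failures, surv.pred, pro.time = 6)
bs  # 0.085
```

The second subject is censored at time 5 and is excluded, so the score averages the squared errors 0.04, 0.25, 0.04 and 0.01 of the four remaining subjects.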

Value

`times`	A vector of numeric values with the times of the `predictions`.
`predictions`	A list of matrices with the predicted survival of each subject (rows) at each observed time (columns). Each matrix corresponds to one of the included `methods`, and the last item, entitled "sl", corresponds to the resulting SL. If `keep.predictions=FALSE`, it corresponds to a single matrix with the predictions related to the SL.
`data`	The data frame used for learning. The first column is entitled `times` and corresponds to the observed follow-up times. The second column is entitled `failures` and corresponds to the event indicators. The other columns correspond to the predictors.
`predictors`	A list with the predictors involved in `group`, `cov.quanti` and `cov.quali`.
`ROC.precision`	The percentiles (between 0 and 1) of the prognostic variable used for computing each point of the time dependent ROC curve.
`cv`	The number of splits for cross-validation.
`pro.time`	The maximum delay for which the prognostic capacity is evaluated.
`models`	A list with the estimated models/algorithms included in the SL.
`weights`	A list composed of two vectors: the regression `coefficients` of the multinomial logistic regression and the resulting weights' `values`.
`metric`	A list composed of two elements: the loss function used to estimate the weights of the algorithms in the SL and its value.
`param.tune`	The estimated tuning parameters.
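
The link between the `coefficients` and the `values` of `weights` can be illustrated with a small sketch of a multinomial logistic (softmax) transformation. The parametrisation used here is an assumption, not taken from the package: one free parameter per method except the last, which serves as the reference with a parameter fixed at 0. The helper `weights_from_params` is hypothetical.

```r
# Map multinomial logistic parameters to weights that sum to 1
weights_from_params <- function(par) {
  eta <- c(par, 0)           # assumed: the last method is the reference
  w <- exp(eta - max(eta))   # subtracting the max avoids overflow
  w / sum(w)
}

# Three methods, all parameters at 0: equal weights
w0 <- weights_from_params(c(0, 0))

# A positive parameter gives the first method a larger weight
w1 <- weights_from_params(c(log(2), 0))
w1  # 0.50 0.25 0.25
```

Under this assumption, initial values of 0 (the default of `param.weights.init`) correspond to starting the optimization from equal weights.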

Author(s)

Yohann Foucher <Yohann.Foucher@univ-poitiers.fr>

Camille Sabathe <camille.sabathe@univ-nantes.fr>

References

Polley E and van der Laan M. Super Learner In Prediction. http://biostats.bepress.com/ucbbiostat/paper266. 2010.

Sabathe C and Foucher Y. Super Learner for survival prediction from censored data: Extension of the R package RISCA. Manuscript submitted. 2022.

Examples


data(dataDIVAT2)

# The outcome model based on a Super Learner and the first 150 individuals of the database
sl1<-sl.time( methods=c("aft.gamma", "ph.gompertz"),  metric="ibs",
  data=dataDIVAT2[1:150,],  times="times", failures="failures", group="ecd",
  cov.quanti=c("age"),  cov.quali=c("hla", "retransplant"), cv=3)
  
# Individual prediction
pred <- predict(sl1, newdata=data.frame(age=c(52,52), hla=c(0,1), retransplant=c(1,1), ecd=c(0,1)))

plot(y=pred$predictions$sl[1,], x=pred$times, xlab="Time (years)", ylab="Predicted survival",
     col=1, type="l", lty=1, lwd=2, ylim=c(0,1))

lines(y=pred$predictions$sl[2,], x=pred$times, col=2, type="l", lty=1, lwd=2)

legend("topright", col=c(1,2), lty=1, lwd=2, c("Subject #1", "Subject #2"))
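
Reading the predicted survival at one given time can be sketched as follows. The helper `surv_at` is hypothetical and assumes that the prediction times are sorted in increasing order and that survival is constant between two consecutive prediction times; the toy objects only mimic the shape of `pred$times` and `pred$predictions$sl`.

```r
# Predicted survival of each subject at time t (step-function lookup)
surv_at <- function(times, surv, t) {
  # Index of the last prediction time <= t (0 if t precedes all times)
  idx <- findInterval(t, times)
  if (idx == 0) return(rep(1, nrow(surv)))  # assumed: survival is 1 before the first time
  surv[, idx]
}

# Toy stand-in: 2 subjects (rows), 4 prediction times (columns)
toy.times <- c(1, 2, 5, 8)
toy.sl <- rbind(c(0.95, 0.90, 0.70, 0.50),
                c(0.90, 0.80, 0.55, 0.30))
s6 <- surv_at(toy.times, toy.sl, t = 6)
s6  # 0.70 0.55
```

With the objects of the example above, the equivalent call would be `surv_at(pred$times, pred$predictions$sl, t = 5)`.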

RISCA documentation built on March 31, 2023, 11:06 p.m.