mtlr_cv: MTLR Internal Cross-Validation for Selecting C1.


Description

MTLR Internal Cross-Validation for Selecting C1.

Usage

mtlr_cv(formula, data, time_points = NULL, nintervals = NULL,
  normalize = T, C1_vec = c(0.001, 0.01, 0.1, 1, 10, 100, 1000),
  train_biases = T, train_uncensored = T, seed_weights = NULL,
  previous_weights = T, loss = c("ll", "concordance"), nfolds = 5,
  foldtype = c("fullstrat", "censorstrat", "random"), verbose = FALSE,
  threshold = 1e-05, maxit = 5000, lower = -15, upper = 15)

Arguments

formula

a formula object with the response to the left of the "~" operator. The response must be a survival object returned by the Surv function.

data

a data.frame containing the features for survival prediction. These must be variables corresponding to the formula object.

time_points

the time points for MTLR to create weights. If left as NULL, the time_points chosen will be based on equally spaced quantiles of the survival times. In the case of interval censored data note that only the start time is considered and not the end time for selecting time points. It is strongly recommended to specify time points if your data is heavily interval censored. If time_points is not NULL then nintervals is ignored.
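Following the recommendation above for heavily interval-censored data, a minimal sketch of passing explicit time points (the grid below is illustrative only; choose values that cover your own data's follow-up range):

```r
library(MTLR)
library(survival)

# Illustrative grid of 10 equally spaced time points. When time_points is
# supplied, nintervals is ignored.
tp <- seq(30, 1000, length.out = 10)

cv_mod <- mtlr_cv(Surv(time, status) ~ ., data = lung, time_points = tp)
```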

nintervals

Number of time intervals to use for MTLR. Note the number of time points will be nintervals + 1. If left as NULL a default of sqrt(N) is used where N is the number of observations in the supplied dataset. This parameter is ignored if time_points is specified.

normalize

if TRUE, variables will be normalized (mean 0, standard deviation of 1). This is STRONGLY suggested. If normalization does not occur it is much more likely that MTLR will fail to converge. Additionally, if FALSE consider adjusting "lower" and "upper" used for L-BFGS-B optimization.

C1_vec

a vector of regularization parameters to test. All values must be non-negative. For large datasets you may want to reduce the number of values tested to increase efficiency; the same applies to nfolds.

train_biases

if TRUE, biases will be trained before feature weights (and again trained while training feature weights). This has been shown to speed up total training time.

train_uncensored

if TRUE, one round of training will occur assuming all event times are uncensored. This is done due to the non-convexity issue that arises in the presence of censored data. However, if ALL the data are censored we recommend setting this option to FALSE, as it has been shown to give poor results in this case.

seed_weights

the initialization weights for the biases and the features. If left as NULL all weights are initialized to zero. If seed_weights are specified then either nintervals or time_points must also be specified. The length of seed_weights should correspond to (number of features + 1)*(length of time_points) = (number of features + 1)*(nintervals + 1).
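As a quick sanity check on the dimensions above (base R only; `nfeatures` and `nintervals` are placeholder values for illustration):

```r
nfeatures  <- 8   # number of features in the model
nintervals <- 14  # so there are nintervals + 1 = 15 time points

# Required length of seed_weights: one weight per (feature + bias) per time point
expected_len <- (nfeatures + 1) * (nintervals + 1)
expected_len  # 135
```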

previous_weights

a boolean specifying if sequential folds should use the previous fold's parameters as seed_weights. Doing this will likely speed up the computation time for cross-validation, as we are providing weights which are (likely) close to the optimal weights. Note that this is done separately for each value of C1, so weights are shared only across folds for the same value of C1, never between different values of C1.

loss

a string indicating the loss to optimize when choosing the regularization parameter. Currently one can optimize for the log-likelihood ("ll") or concordance ("concordance"). See Details regarding these losses.

nfolds

the number of internal cross validation folds, default is 5.

foldtype

type of cross validation folds. Full stratification, "fullstrat", sorts observations by their event time and their event indicators and numbers them off into folds. This effectively gives each fold approximately the same number of uncensored observations and keeps the range of time points as equivalent as possible across folds. This type of cross-validation is completely deterministic. Censored stratification, "censorstrat", will put approximately the same number of uncensored observations in each fold but pays no attention to event time. This is partially stochastic. The totally random cross-validation, "random", randomly assigns observations to folds without considering event time or event status.
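The "fullstrat" scheme described above can be sketched in a few lines of base R: sort observations by event time (breaking ties by event indicator) and deal them into folds round-robin. This is an illustrative reimplementation, not the package's internal code.

```r
fullstrat_folds <- function(time, status, nfolds = 5) {
  ord <- order(time, status)  # sort by event time, then event indicator
  fold <- integer(length(time))
  # number the sorted observations off into folds 1, 2, ..., nfolds, 1, 2, ...
  fold[ord] <- rep_len(seq_len(nfolds), length(time))
  fold
}

# Toy data: 10 uncensored observations with shuffled event times
fullstrat_folds(c(5, 1, 9, 3, 7, 2, 8, 4, 6, 10), rep(1, 10), nfolds = 5)
```

Because the assignment depends only on the sorted order, the result is completely deterministic, matching the description above.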

verbose

if TRUE the progress will be printed for every completed value of C1.

threshold

The threshold for the convergence tolerance (in the objective function) when training the feature weights. This threshold will be passed to optim.

maxit

The maximum iterations to run for MTLR. This parameter will be passed to optim.

lower

The lower bound for L-BFGS-B optimization. This parameter will be passed to optim.

upper

The upper bound for L-BFGS-B optimization. This parameter will be passed to optim.

Details

The log-likelihood loss and concordance are supported for optimizing C1. The log-likelihood loss treats censored and uncensored observations differently. For uncensored observations, the loss is the negative log of the probability assigned to the interval in which the observation had its event, e.g. if an observation's event interval was assigned a probability of 0.2, the loss is -log(0.2). We want these probabilities to be large, so we would normally want to maximize this value (since logs of probabilities are negative), but we instead take the negative and minimize it; thus we want the lowest loss. For censored observations, the loss is the negative log of the survival probability at the time of censoring, e.g. if an observation is censored at time = 42 we take the negative log of the survival probability assigned to time 42 as the loss.
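A small numeric illustration of the two cases (base R; the probabilities are made up):

```r
# Uncensored: the event fell in an interval assigned probability 0.2
uncensored_loss <- -log(0.2)  # ~1.609

# Censored at t = 42: suppose the model's survival probability at 42 is 0.7
censored_loss <- -log(0.7)    # ~0.357
```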

For the concordance loss, C1 is chosen to maximize the overall concordance when using the negative median as the "risk" score. This is computed using survConcordance in the survival package.
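A hedged sketch of that concordance computation, assuming you already have a vector of predicted median survival times from a fitted model (the `pred_median` values below are placeholders, not real predictions):

```r
library(survival)

# Placeholder predicted median survival times for five observations;
# in practice these come from your fitted MTLR model.
pred_median <- c(200, 310, 150, 480, 90)

obs <- data.frame(time   = c(180, 300, 170, 500, 100),
                  status = c(1, 1, 1, 0, 1))
obs$risk <- -pred_median  # negative median as the "risk" score

fit <- survConcordance(Surv(time, status) ~ risk, data = obs)
fit$concordance
```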

Value

Performing mtlr_cv will return the best value of C1 found (best_C1) and the average loss across the internal cross-validation folds for each value of C1 tested (avg_loss).

See Also

mtlr

Examples

library(survival)
cv_mod <- mtlr_cv(Surv(time,status)~., data = lung)
#Note the best C1 also corresponds to the lowest average loss:
cv_mod

Example output

$best_C1
[1] 1

$avg_loss
   0.001     0.01      0.1        1       10      100     1000 
2.600233 2.326139 2.173730 2.108337 2.120434 2.142821 2.147808 

MTLR documentation built on June 4, 2019, 1:02 a.m.