pcoxtimecv | R Documentation |
Performs k-fold cross-validation for pcoxtime, plots the solution paths, and returns the optimal value of lambda (and the optimal alpha if more than one is given).
pcoxtimecv(
  formula,
  data,
  alphas = 1,
  lambdas = NULL,
  nlambdas = 100,
  lammin_fract = NULL,
  lamfract = 0.6,
  nfolds = 10,
  foldids = NULL,
  devtype = "vv",
  refit = FALSE,
  maxiter = 1e+05,
  tol = 1e-08,
  quietly = FALSE,
  seed = NULL,
  nclusters = 1,
  na.action = na.omit,
  ...
)
formula |
object of class formula describing
the model. The response is specified similar to
|
data |
optional data frame containing variables specified in the formula. |
alphas |
elasticnet mixing parameter, with
|
lambdas |
optional user-supplied sequence. If |
nlambdas |
the number of lambda values. Default is |
lammin_fract |
smallest value of |
lamfract |
proportion of regularization path to consider. If |
nfolds |
number of folds. Default is |
foldids |
an optional sequence of values between |
devtype |
loss to use for cross-validation. Currently, two options are available but versions will implement |
refit |
logical. Whether to return solution path based on optimal lambda and alpha picked by the model. Default is |
maxiter |
maximum number of iterations to convergence. Default is 1e5. Consider increasing it if the model does not converge. |
tol |
convergence threshold for proximal gradient descent. Each proximal update continues until the relative change in all the coefficients (i.e. √{∑(β_{k+1} - β_k)^2}/stepsize) is less than tol. The default value is 1e-8. |
quietly |
logical. If TRUE, refit progress is printed. |
seed |
random seed. Default is |
nclusters |
number of cores to use to run the cross-validation in parallel. Default is |
na.action |
a function which indicates what should happen when the data contain NAs. |
... |
additional arguments not implemented. |
The function fits pcoxtime nfolds + 1 times (if refit = FALSE) or nfolds + 2 times (if refit = TRUE). In the former case, the solution path displayed by plot.pcoxtimecv is randomly picked from the cross-validation runs. In the latter case, the solution path plot is based on the model refitted using the optimal parameters. In both cases, the function first runs pcoxtime to compute the lambda sequence and then performs cross-validation on nfolds folds.
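As a rough illustration of the refit behaviour described above, the sketch below requests the extra refit and then plots the result. The fold count and the seed value are arbitrary choices made for this example, and the data set is taken from the survival package as in the examples on this page.

```r
library(survival)   # provides Surv() and the veteran data
library(pcoxtime)

veteran <- survival::veteran

# refit = TRUE: per the description above, one extra fit is made with the
# selected parameters, and the plotted solution path comes from that refit.
# nfolds = 5 and seed = 1234 are arbitrary values chosen for illustration.
cv_refit <- pcoxtimecv(Surv(time, status) ~ factor(trt) + karno + diagtime + age + prior
	, data = veteran
	, alphas = 1
	, nfolds = 5
	, refit = TRUE
	, lamfract = 0.6
	, seed = 1234
)

# Dispatches to plot.pcoxtimecv
plot(cv_refit)
```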
If more than one value of alphas is specified, say c(0.2, 0.5, 1), pcoxtimecv will search (experimental) for the optimal alpha with respect to the corresponding lambda values. In this case, the optimal alpha and lambda sequence will be returned, i.e., the (alphas, lambdas) pair that corresponds to the lowest predicted cross-validated error (likelihood deviance).
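A minimal sketch of the alpha search described above, assuming the veteran data from the survival package. The three alpha values are the ones mentioned in the text; the seed is an arbitrary value added for reproducibility.

```r
library(survival)   # provides Surv() and the veteran data
library(pcoxtime)

veteran <- survival::veteran

# Cross-validate over several elastic net mixing parameters at once
cv_alpha <- pcoxtimecv(Surv(time, status) ~ factor(trt) + karno + diagtime + age + prior
	, data = veteran
	, alphas = c(0.2, 0.5, 1)
	, refit = FALSE
	, lamfract = 0.6
	, seed = 1234   # arbitrary seed, chosen for this example only
)

# The selected (alpha, lambda) pair
print(cv_alpha$alpha.optimal)
print(cv_alpha$lambda.min)
```

The selected pair can then be passed to pcoxtime through its alpha and lambda arguments, as in the examples at the end of this page.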
For data sets with a very large number of predictors, it is recommended to calculate only partial paths by lowering the value of lamfract. In other words, for p > n problems, the solution near lambda = 0 is poorly behaved and may account for over 99% of the function's runtime. We therefore recommend always specifying lamfract < 1 and increasing it if the optimal lambda suggests that lower values are needed.
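The recommendation above could be followed along these lines. This is only a sketch: it assumes that lambdas.optimal is returned as a numeric vector, and the value lamfract = 0.3 and the seed are arbitrary choices for illustration.

```r
library(survival)   # provides Surv() and the veteran data
library(pcoxtime)

veteran <- survival::veteran

# Explore only part of the regularization path
cv_partial <- pcoxtimecv(Surv(time, status) ~ factor(trt) + karno + diagtime + age + prior
	, data = veteran
	, alphas = 1
	, lamfract = 0.3   # arbitrary value below the default of 0.6
	, seed = 1234      # arbitrary seed
)

# Assumption: lambdas.optimal is a numeric vector of the lambdas explored.
# If the selected lambda sits at the small end of that sequence, the optimum
# may lie further along the path, so a larger lamfract is worth trying.
smallest_lambda <- min(as.numeric(cv_partial$lambdas.optimal))
if (isTRUE(all.equal(cv_partial$lambda.min, smallest_lambda))) {
	message("lambda.min is at the end of the explored path; consider a larger lamfract")
}
```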
An S3 object of class pcoxtimecv:
lambda.min |
the value of lambda that gives minimum cross-validated error. |
lambda.1se |
largest value of lambda such that error is within |
alpha.optimal |
optimal alpha corresponding to |
lambdas.optimal |
the sequence of lambdas containing |
foldids |
the fold assignment used. |
dfs |
list of data frames containing mean cross-validated error summaries and estimated coefficients in each fold. |
fit |
if |
.
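To make the distinction between lambda.min and lambda.1se concrete, the base R sketch below applies the usual one-standard-error rule to made-up numbers. It assumes, from the name lambda.1se, that the function follows this convention; the vectors are hypothetical and are not output from pcoxtimecv.

```r
# Hypothetical cross-validation summary (illustrative numbers only)
lambdas <- c(1.0, 0.5, 0.25, 0.125, 0.0625)
cvm     <- c(12.0, 10.4, 10.0, 10.1, 10.6)   # mean cross-validated error
cvse    <- c(0.5, 0.5, 0.5, 0.5, 0.5)        # standard error of the mean

# lambda.min: the lambda with the smallest mean cross-validated error
i_min <- which.min(cvm)
lambda_min <- lambdas[i_min]

# lambda.1se: the largest lambda whose error is within one standard error
# of the minimum error
threshold <- cvm[i_min] + cvse[i_min]
lambda_1se <- max(lambdas[cvm <= threshold])

print(c(lambda.min = lambda_min, lambda.1se = lambda_1se))
```

Choosing lambda.1se rather than lambda.min gives a more heavily penalized model whose error is statistically close to the best one.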
Dai, B., and Breheny, P. (2019). Cross validation approaches for penalized Cox regression. arXiv preprint arXiv:1905.10432.
Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2011). Regularization paths for Cox's proportional hazards model via coordinate descent. Journal of Statistical Software, 39(5), 1-13. doi: 10.18637/jss.v039.i05.
plot.pcoxtimecv, pcoxtime
# Attach the packages used below (Surv() comes from survival)
library(survival)
library(pcoxtime)

# Time-independent covariates
# In survival >= 3.2.9 the veteran data are loaded via data(cancer)
if (packageVersion("survival") >= "3.2.9") {
	data(cancer, package = "survival")
} else {
	data(veteran, package = "survival")
}

# Cross-validate to select lambda
cv1 <- pcoxtimecv(Surv(time, status) ~ factor(trt) + karno + diagtime + age + prior
	, data = veteran
	, alphas = 1
	, refit = FALSE
	, lamfract = 0.6
)
print(cv1)

# Train model using optimal alpha and lambda
fit1 <- pcoxtime(Surv(time, status) ~ factor(trt) + karno + diagtime + age + prior
	, data = veteran
	, alpha = cv1$alpha.optimal
	, lambda = cv1$lambda.min
)
print(fit1)

# Time-varying covariates
data(heart, package = "survival")
cv2 <- pcoxtimecv(Surv(start, stop, event) ~ age + year + surgery + transplant
	, data = heart
	, alphas = 1
	, refit = FALSE
	, lamfract = 0.6
)
print(cv2)

# Train model using optimal alpha and lambda
fit2 <- pcoxtime(Surv(start, stop, event) ~ age + year + surgery + transplant
	, data = heart
	, alpha = cv2$alpha.optimal
	, lambda = cv2$lambda.min
)
print(fit2)