larsDR_coxph: Fitting a LASSO/LARS model on the (Deviance) Residuals

larsDR_coxph    R Documentation

Fitting a LASSO/LARS model on the (Deviance) Residuals

Description

This function fits a Cox model on the variables obtained from a LASSO/LARS model computed with

  • as the response: the Residuals of a Cox-Model fitted with no covariate

  • as explanatory variables: Xplan.

It uses the package lars to perform the LASSO/LARS fit.

Usage

larsDR_coxph(Xplan, ...)

## Default S3 method:
larsDR_coxph(
  Xplan,
  time,
  time2,
  event,
  type,
  origin,
  typeres = "deviance",
  collapse,
  weighted,
  scaleX = FALSE,
  scaleY = TRUE,
  plot = FALSE,
  typelars = "lasso",
  normalize = TRUE,
  max.steps,
  use.Gram = TRUE,
  allres = FALSE,
  verbose = TRUE,
  ...
)

## S3 method for class 'formula'
larsDR_coxph(
  Xplan,
  time,
  time2,
  event,
  type,
  origin,
  typeres = "deviance",
  collapse,
  weighted,
  scaleX = FALSE,
  scaleY = TRUE,
  plot = FALSE,
  typelars = "lasso",
  normalize = TRUE,
  max.steps,
  use.Gram = TRUE,
  allres = FALSE,
  dataXplan = NULL,
  subset,
  weights,
  model_frame = FALSE,
  model_matrix = FALSE,
  verbose = TRUE,
  contrasts.arg = NULL,
  ...
)

Arguments

Xplan

a formula or a matrix with the eXplanatory variables (training) dataset

...

Arguments to be passed on to survival::coxph or to lars::lars.

time

for right censored data, this is the follow up time. For interval data, the first argument is the starting time for the interval.

time2

ending time of the interval for interval censored or counting process data only. Intervals are assumed to be open on the left and closed on the right, (start, end]. For counting process data, event indicates whether an event occurred at the end of the interval.

event

The status indicator, normally 0=alive, 1=dead. Other choices are TRUE/FALSE (TRUE = death) or 1/2 (2=death). For interval censored data, the status indicator is 0=right censored, 1=event at time, 2=left censored, 3=interval censored. Although unusual, the event indicator can be omitted, in which case all subjects are assumed to have an event.

type

character string specifying the type of censoring. Possible values are "right", "left", "counting", "interval", or "interval2". The default is "right" or "counting" depending on whether the time2 argument is absent or present, respectively.
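
As a small illustration of the default censoring type, the sketch below builds two survival objects directly with survival::Surv, the function these arguments mirror. The toy vectors (start, stop, status) are hypothetical and only serve to show how the presence of time2 changes the type.

```r
library(survival)

# Hypothetical toy data: three subjects
start  <- c(0, 0, 5)     # interval start (counting process data only)
stop   <- c(5, 8, 12)    # follow-up time or interval end
status <- c(1, 0, 1)     # 1 = event, 0 = censored

# Without time2 the survival object is right censored
s_right <- Surv(time = stop, event = status)

# With time2 it is treated as counting process data on (start, stop]
s_count <- Surv(time = start, time2 = stop, event = status)

attr(s_right, "type")
attr(s_count, "type")
```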

origin

for counting process data, the hazard function origin. This option was intended to be used in conjunction with a model containing time dependent strata in order to align the subjects properly when they cross over from one strata to another, but it has rarely proven useful.

typeres

character string indicating the type of residual desired. Possible values are "martingale", "deviance", "score", "schoenfeld", "dfbeta", "dfbetas", and "scaledsch". Only enough of the string to determine a unique match is required.
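
To make the default "deviance" choice more concrete, the following sketch recomputes deviance residuals from martingale residuals for a Cox model without covariates, using the usual transform d = sign(M) * sqrt(-2 * (M + status * log(status - M))). The simulated data and all variable names are assumptions made for illustration; only survival::coxph and its residuals method are used.

```r
library(survival)

set.seed(42)
n <- 40
time   <- rexp(n)              # simulated follow-up times
status <- rbinom(n, 1, 0.7)    # simulated status indicator (1 = event)

# Cox model without covariates
null_fit <- coxph(Surv(time, status) ~ 1)

# Residuals as returned by survival
M <- residuals(null_fit, type = "martingale")
D <- residuals(null_fit, type = "deviance")

# Manual transform; the log term is dropped for censored subjects,
# where it is multiplied by status = 0
log_term <- ifelse(status == 0, 0, status * log(status - M))
D_manual <- sign(M) * sqrt(-2 * (M + log_term))

all.equal(unname(D), unname(D_manual))
```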

collapse

vector indicating which rows to collapse (sum) over. In time-dependent models more than one row of data can pertain to a single individual. If there were 4 individuals represented by 3, 1, 2 and 4 rows of data respectively, then collapse=c(1,1,1,2,3,3,4,4,4,4) could be used to obtain per subject rather than per observation residuals.

weighted

if TRUE and the model was fit with case weights, then the weighted residuals are returned.

scaleX

Should the Xplan columns be standardized?

scaleY

Should the time values be standardized?

plot

Should the survival function be plotted?

typelars

One of "lasso", "lar", "forward.stagewise" or "stepwise". The names can be abbreviated to any unique substring. Default is "lasso".

normalize

If TRUE, each variable is standardized to have unit L2 norm, otherwise it is left alone. Default is TRUE.

max.steps

Limit the number of steps taken; the default is 8 * min(m, n-intercept), with m the number of variables, and n the number of samples. For typelars="lar" or typelars="stepwise", the maximum number of steps is min(m,n-intercept). For typelars="lasso" and especially typelars="forward.stagewise", there can be many more terms, because although no more than min(m,n-intercept) variables can be active during any step, variables are frequently dropped and added as the algorithm proceeds. Although the default usually guarantees that the algorithm has proceeded to the saturated fit, users should check.

use.Gram

When the number m of variables is very large, i.e. larger than the number n of samples, then you may not want LARS to precompute the Gram matrix. Default is use.Gram=TRUE.

allres

FALSE to return only the Cox model and TRUE for additional results. See details. Defaults to FALSE.

verbose

Should some details be displayed?

dataXplan

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in dataXplan, the variables are taken from environment(Xplan), typically the environment from which larsDR_coxph is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

weights

an optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector.

model_frame

If TRUE, the model frame is returned.

model_matrix

If TRUE, the model matrix is returned.

contrasts.arg

a list, whose entries are values (numeric matrices, functions or character strings naming functions) to be used as replacement values for the contrasts replacement function and whose names are the names of columns of data containing factors.

Details

This function fits a LASSO/LARS model in which the response is the residuals of a Cox model fitted with an intercept as its only explanatory variable, and the explanatory variables are those in Xplan. By default, the deviance residuals are used.

If allres=FALSE, only the final Cox model is returned. If allres=TRUE, a list is returned with the (Deviance) Residuals, the LASSO/LARS model fitted to the (Deviance) Residuals, the eXplanatory variables and the final Cox model. allres=TRUE is useful for evaluating model prediction accuracy on a test sample.
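
The procedure described above can be sketched with survival and lars alone. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, all object names are hypothetical, and the rule used to pick variables for the final Cox model (nonzero coefficients at the last LARS step) is an assumption that may differ from what larsDR_coxph does internally.

```r
library(survival)
library(lars)

set.seed(1)
n <- 60
p <- 5
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
time   <- rexp(n, rate = exp(0.8 * X[, "x1"]))  # simulated survival times
status <- rbinom(n, 1, 0.7)                     # simulated status indicator

# Step 1: Cox model without covariates, then its deviance residuals
null_fit <- coxph(Surv(time, status) ~ 1)
dev_res  <- residuals(null_fit, type = "deviance")

# Step 2: LASSO/LARS fit of the residuals on the explanatory variables
lars_fit <- lars(X, dev_res, type = "lasso", normalize = TRUE, max.steps = 3)

# Step 3 (assumed rule): keep variables with a nonzero coefficient
# at the last step of the LARS path
beta     <- coef(lars_fit)
last     <- beta[nrow(beta), ]
selected <- names(last)[last != 0]

# Step 4: final Cox model on the selected variables
dat <- data.frame(time = time, status = status, X[, selected, drop = FALSE])
final_formula <- as.formula(
  paste("Surv(time, status) ~", paste(selected, collapse = " + "))
)
final_fit <- coxph(final_formula, data = dat)
final_fit
```

Keeping the intermediate objects (dev_res, lars_fit, the selected columns and final_fit) side by side mirrors what allres=TRUE is documented to return.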

Value

If allres=FALSE:

cox_larsDR

Final Cox-model.

If allres=TRUE:

DR_coxph

The (Deviance) Residuals.

larsDR

The LASSO/LARS model fitted to the (Deviance) Residuals.

X_larsDR

The eXplanatory variables.

cox_larsDR

Final Cox-model.

Author(s)

Frédéric Bertrand
frederic.bertrand@utt.fr
http://www-irma.u-strasbg.fr/~fbertran/

References

plsRcox, Cox-Models in a high dimensional setting in R, Frederic Bertrand, Philippe Bastien, Nicolas Meyer and Myriam Maumy-Bertrand (2014). Proceedings of useR! 2014, Los Angeles, page 152.

Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data, Philippe Bastien, Frederic Bertrand, Nicolas Meyer and Myriam Maumy-Bertrand (2015), Bioinformatics, 31(3):397-404, doi:10.1093/bioinformatics/btu660.

See Also

coxph, lars

Examples


library(plsRcox)

# Load the example datasets shipped with plsRcox
data(micro.censure)
data(Xmicro.censure_compl_imp)

# Training set: first 80 rows, with every column coerced to numeric
X_train_micro <- apply(as.matrix(Xmicro.censure_compl_imp), MARGIN = 2, FUN = as.numeric)[1:80, ]
X_train_micro_df <- data.frame(X_train_micro)
Y_train_micro <- micro.censure$survyear[1:80]
C_train_micro <- micro.censure$DC[1:80]

# Default (matrix) method
(cox_larsDR_fit <- larsDR_coxph(X_train_micro, Y_train_micro, C_train_micro,
                                max.steps = 6, use.Gram = FALSE, scaleX = TRUE))
# Formula method with the matrix on the right-hand side
(cox_larsDR_fit <- larsDR_coxph(~X_train_micro, Y_train_micro, C_train_micro,
                                max.steps = 6, use.Gram = FALSE, scaleX = TRUE))
# Formula method with the variables taken from a data frame
(cox_larsDR_fit <- larsDR_coxph(~., Y_train_micro, C_train_micro,
                                max.steps = 6, use.Gram = FALSE, scaleX = TRUE,
                                dataXplan = X_train_micro_df))

# Effect of scaleX, and the full list of results with allres = TRUE
larsDR_coxph(~X_train_micro, Y_train_micro, C_train_micro, max.steps = 6, use.Gram = FALSE)
larsDR_coxph(~X_train_micro, Y_train_micro, C_train_micro, max.steps = 6, use.Gram = FALSE,
             scaleX = FALSE)
larsDR_coxph(~X_train_micro, Y_train_micro, C_train_micro, max.steps = 6, use.Gram = FALSE,
             scaleX = TRUE, allres = TRUE)

# Clean up
rm(X_train_micro, X_train_micro_df, Y_train_micro, C_train_micro, cox_larsDR_fit)


plsRcox documentation built on Dec. 1, 2022, 1:31 a.m.