iCoxBoost: Interface for cross-validation and model fitting using a...

In CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks


View source: R/iCoxBoost.R

Formula interface for fitting a Cox proportional hazards model by componentwise likelihood based boosting (via a call to CoxBoost), where cross-validation can be performed automatically for determining the number of boosting steps (via a call to cv.CoxBoost).

iCoxBoost(formula,data=NULL,weights=NULL,subset=NULL,mandatory=NULL,
		  cause=1,standardize=TRUE,stepno=200,
		  criterion=c("pscore","score","hpscore","hscore"),
		  nu=0.1,stepsize.factor=1,varlink=NULL,
		  cv=cvcb.control(),trace=FALSE,...)

`formula`	A formula describing the model to be fitted, similar to a call to `coxph`. The response must be a survival object, either as returned by `Surv` or `Hist` (in a competing risks application).
`data`	data frame containing the variables described in the formula.
`weights`	optional vector, for specifying weights for the individual observations.
`subset`	a vector specifying a subset of observations to be used in the fitting process.
`mandatory`	vector containing the names of the covariates whose effect is to be estimated un-regularized.
`cause`	cause of interest in a competing risks setting, when the response is specified by `Hist` (see e.g. Fine and Gray, 1999; Binder et al. 2009a).
`standardize`	logical value indicating whether covariates should be standardized for estimation. This does not apply for mandatory covariates, i.e., these are not standardized.
`stepno`	maximum number of boosting steps to be evaluated when determining the number of boosting steps by cross-validation, otherwise the number of boosting steps itself.
`criterion`	indicates the criterion to be used for selection in each boosting step. `"pscore"` corresponds to the penalized score statistics, `"score"` to the un-penalized score statistics. Different results will only be seen for un-standardized covariates (`"pscore"` will result in preferential selection of covariates with larger covariance), or if different penalties are used for different covariates. `"hpscore"` and `"hscore"` correspond to `"pscore"` and `"score"`, respectively, except that a heuristic is used for evaluating only a subset of covariates in each boosting step, as described in Binder et al. (2013). This can considerably speed up computation, but may lead to different results.
`nu`	(roughly) the fraction of the partial maximum likelihood estimate used for the update in each boosting step. This is converted into a penalty for the call to `CoxBoost`. Use smaller values, e.g., 0.01 when there is little information in the data, and larger values, such as 0.1, with much information or when the number of events is larger than the number of covariates. Note that the default for direct calls to `CoxBoost` corresponds to `nu=0.1`.
`stepsize.factor`	determines the step-size modification factor by which the natural step size of boosting steps should be changed after a covariate has been selected in a boosting step. The default (value `1`) implies constant `nu`, for a value < 1 the value `nu` for a covariate is decreased after it has been selected in a boosting step, and for a value > 1 the value `nu` is increased. If `pendistmat` is given, updates of `nu` are only performed for covariates that have at least one connection to another covariate.
`varlink`	list for specifying links between covariates, used to re-distribute step sizes when `stepsize.factor != 1`. The list needs to contain at least two vectors, the first containing the names of the source covariates, the second containing the names of the corresponding target covariates, and a third (optional) vector containing weights between 0 and 1 (defaulting to 1). If `nu` is increased/decreased for one of the source covariates according to `stepsize.factor`, the `nu` for the corresponding target covariate is decreased/increased accordingly (multiplied by the weight). If `formula` contains interaction terms, rules for these can also be set up, using variable names such as `V1:V2` for the interaction term between covariates `V1` and `V2`.
`cv`	`TRUE`, for performing cross-validation, with default parameters, `FALSE` for not performing cross-validation, or list containing the parameters for cross-validation, as obtained from a call to `cvcb.control`.
`trace`	logical value indicating whether progress in estimation should be indicated by printing the name of the covariate updated.
`...`	miscellaneous arguments, passed to the call to `cv.CoxBoost`.
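The three ways of setting `cv` described above can be sketched as follows. This is a minimal illustration on simulated data, not part of the original help page: the object names (`toy`, `fit.fixed`, `fit.cv`) are made up here, and the `K` argument of `cvcb.control` (number of folds) is assumed to be available as in the CoxBoost manual. The model fits only run if CoxBoost is installed.

```r
# Simulated toy data (hypothetical; one informative covariate, V1)
set.seed(42)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
toy <- as.data.frame(x)                    # columns are named V1, ..., V20
real.time <- rexp(n, rate = exp(x[, 1]))
cens.time <- rexp(n, rate = 1/10)
toy$status <- as.numeric(real.time <= cens.time)
toy$time <- pmin(real.time, cens.time)

if (requireNamespace("CoxBoost", quietly = TRUE)) {
  library(survival)   # provides Surv
  library(CoxBoost)

  # No cross-validation: stepno is the number of boosting steps itself
  fit.fixed <- iCoxBoost(Surv(time, status) ~ ., data = toy,
                         cv = FALSE, stepno = 50)

  # Cross-validation with non-default settings (assumed: K = number of folds);
  # stepno is then the maximum number of steps evaluated.
  # Setting the seed first makes the random fold assignment reproducible.
  set.seed(123)
  fit.cv <- iCoxBoost(Surv(time, status) ~ ., data = toy,
                      stepno = 50, cv = cvcb.control(K = 5))
}
```

Passing `cv=TRUE` instead would request cross-validation with the default parameters.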

In contrast to gradient boosting (implemented e.g. in the glmboost routine in the R package mboost, using the CoxPH loss function), CoxBoost is not based on gradients of loss functions, but adapts the offset-based boosting approach from Tutz and Binder (2007) for estimating Cox proportional hazards models. In each boosting step the previous boosting steps are incorporated as an offset in penalized partial likelihood estimation, which is employed to obtain an update for one single parameter, i.e., one covariate, in every boosting step. This results in sparse fits similar to Lasso-like approaches, with many estimated coefficients being zero. The main model complexity parameter, the number of boosting steps, is automatically selected by cross-validation using a call to cv.CoxBoost. Note that this will introduce random variation when repeatedly calling iCoxBoost, so it is advised to set/save the random number generator state for reproducible results.
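The offset-based, componentwise update described above can be illustrated with a small base-R sketch. This is a simplified toy version, not the CoxBoost implementation: the function name `toy_cox_boost`, the fixed penalty `lambda = 50`, and the assumption of untied event times are all choices made here for illustration. In each step it computes, for every covariate, the score and information of the partial log-likelihood at the current linear predictor (the offset), selects the covariate with the largest penalized score statistic, and updates only that coefficient by one penalized Newton step.

```r
# Toy componentwise likelihood-based boosting for a Cox model (base R only).
# Illustrative sketch under the assumptions stated above.
toy_cox_boost <- function(time, status, x, steps = 20, lambda = 50) {
  ord <- order(time)                    # sort so that risk sets are suffixes
  status <- status[ord]
  x <- x[ord, , drop = FALSE]
  ev <- status == 1
  # Reverse cumulative sum: sum over the risk set {l : time_l >= time_i}
  rsum <- function(v) rev(cumsum(rev(v)))
  beta <- numeric(ncol(x))
  for (step in seq_len(steps)) {
    w  <- exp(drop(x %*% beta))         # previous steps enter as an offset
    s0 <- rsum(w)
    s1 <- apply(x * w, 2, rsum)         # risk-set sums of w * x, per covariate
    s2 <- apply(x^2 * w, 2, rsum)       # risk-set sums of w * x^2
    score <- colSums((x - s1 / s0)[ev, , drop = FALSE])
    info  <- colSums((s2 / s0 - (s1 / s0)^2)[ev, , drop = FALSE])
    j <- which.max(score^2 / (info + lambda))   # penalized score statistic
    beta[j] <- beta[j] + score[j] / (info[j] + lambda)
  }
  beta
}

# Usage on simulated data with one informative covariate
set.seed(1)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
time <- rexp(n, rate = exp(x[, 1]))
status <- rbinom(n, 1, 0.8)
beta_hat <- toy_cox_boost(time, status, x, steps = 20)
```

Because only one coefficient changes per step, at most `steps` coefficients can be non-zero, which is the source of the sparsity mentioned above.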

The advantage of the offset-based approach compared to gradient boosting is that the penalty structure is very flexible. In the present implementation this is used for allowing for unpenalized mandatory covariates, which receive a very fast coefficient build-up in the course of the boosting steps, while the other (optional) covariates are subjected to penalization. For example in a microarray setting, the (many) microarray features would be taken to be optional covariates, and the (few) potential clinical covariates would be taken to be mandatory, by including their names in mandatory.

If a group of correlated covariates has influence on the response, e.g. genes from the same pathway, componentwise boosting will often result in a non-zero estimate for only one member of this group. To avoid this, information on the connection between covariates can be provided in varlink. If then, in addition, a penalty updating scheme with stepsize.factor < 1 is chosen, connected covariates are more likely to be chosen in future boosting steps, if a directly connected covariate has been chosen in an earlier boosting step (see Binder and Schumacher, 2009b).
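A `varlink` specification for two connected covariates might look like the sketch below. It follows the list format given for the `varlink` argument (source names first, target names second); the data, the object names `links` and `fit.link`, and the choice `stepsize.factor=0.5` are assumptions made for illustration only. The fit only runs if CoxBoost is installed.

```r
# Simulated toy data with two strongly correlated informative covariates
set.seed(7)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
x[, 2] <- x[, 1] + rnorm(n, sd = 0.3)      # V2 is correlated with V1
toy <- as.data.frame(x)                    # columns are named V1, ..., V20
real.time <- rexp(n, rate = exp(0.5 * x[, 1] + 0.5 * x[, 2]))
cens.time <- rexp(n, rate = 1/10)
toy$status <- as.numeric(real.time <= cens.time)
toy$time <- pmin(real.time, cens.time)

# Link V1 -> V2 and V2 -> V1 (optional third vector of weights omitted)
links <- list(c("V1", "V2"),   # source covariates
              c("V2", "V1"))   # corresponding target covariates

if (requireNamespace("CoxBoost", quietly = TRUE)) {
  library(survival)   # provides Surv
  library(CoxBoost)
  # stepsize.factor < 1 shrinks nu for a selected covariate and, via varlink,
  # raises it for the connected one
  fit.link <- iCoxBoost(Surv(time, status) ~ ., data = toy,
                        cv = FALSE, stepno = 50,
                        stepsize.factor = 0.5, varlink = links)
  summary(fit.link)
}
```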

iCoxBoost returns an object of class iCoxBoost, which also has class CoxBoost. In addition to the elements from CoxBoost it has the following elements:

`call, formula, terms`	call, formula and terms from the formula interface.
`cause`	cause of interest.
`cv.res`	result from `cv.CoxBoost`, if cross-validation has been performed.
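The additional elements can be inspected like ordinary list components. The following is a small sketch on simulated data (object names are made up here); it uses `str` rather than assuming particular component names inside `cv.res`, and only runs the fit if CoxBoost is installed.

```r
# Simulated toy data (hypothetical; one informative covariate, V1)
set.seed(99)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
toy <- as.data.frame(x)
real.time <- rexp(n, rate = exp(x[, 1]))
cens.time <- rexp(n, rate = 1/10)
toy$status <- as.numeric(real.time <= cens.time)
toy$time <- pmin(real.time, cens.time)

if (requireNamespace("CoxBoost", quietly = TRUE)) {
  library(survival)   # provides Surv
  library(CoxBoost)
  fit <- iCoxBoost(Surv(time, status) ~ ., data = toy, stepno = 50)

  class(fit)                        # class iCoxBoost, which also has class CoxBoost
  fit$formula                       # formula from the formula interface
  fit$cause                         # cause of interest
  str(fit$cv.res, max.level = 1)    # result from cv.CoxBoost
}
```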

Written by Harald Binder binderh@uni-mainz.de.

Binder, H., Benner, A., Bullinger, L., and Schumacher, M. (2013). Tailoring sparse multivariable regression techniques for prognostic single-nucleotide polymorphism signatures. Statistics in Medicine, doi: 10.1002/sim.5490.

Binder, H., Allignol, A., Schumacher, M., and Beyersmann, J. (2009a). Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics, 25:890-896.

Binder, H. and Schumacher, M. (2009b). Incorporating pathway information into boosting estimation of high-dimensional risk prediction models. BMC Bioinformatics. 10:18.

Binder, H. and Schumacher, M. (2008). Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics. 9:14.

Tutz, G. and Binder, H. (2007). Boosting ridge regression. Computational Statistics & Data Analysis, 51(12):6044-6059.

Fine, J. P. and Gray, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association. 94:496-509.

predict.iCoxBoost, CoxBoost, cv.CoxBoost.

library(survival)   # provides Surv
library(CoxBoost)   # provides iCoxBoost

#   Generate some survival data with 2 informative covariates
n <- 200; p <- 100
beta <- c(rep(1,2),rep(0,p-2))
x <- matrix(rnorm(n*p),n,p)
actual.data <- as.data.frame(x)
real.time <- -(log(runif(n)))/(10*exp(drop(x %*% beta)))
cens.time <- rexp(n,rate=1/10)
actual.data$status <- ifelse(real.time <= cens.time,1,0)
actual.data$time <- ifelse(real.time <= cens.time,real.time,cens.time)

#   Fit a Cox proportional hazards model by iCoxBoost

cbfit <- iCoxBoost(Surv(time,status) ~ .,data=actual.data) 
summary(cbfit)
plot(cbfit)

#   ... with covariate V1 being mandatory

cbfit.mand <- iCoxBoost(Surv(time,status) ~ .,data=actual.data,mandatory=c("V1")) 
summary(cbfit.mand)
plot(cbfit.mand)

CoxBoost documentation built on May 1, 2019, 9:32 p.m.
