CoxBoost is used to fit a Cox proportional hazards model by componentwise likelihood-based boosting.
It is especially suited for models with a large number of predictors and allows for mandatory covariates with unpenalized parameter estimates.
CoxBoost(time,status,x,unpen.index=NULL,standardize=TRUE,subset=1:length(time),
         weights=NULL,stepno=100,penalty=9*sum(status[subset]==1),
         criterion = c("pscore", "score","hpscore","hscore"),
         stepsize.factor=1,sf.scheme=c("sigmoid","linear"),pendistmat=NULL,
         connected.index=NULL,x.is.01=FALSE,return.score=TRUE,trace=FALSE)

time 
vector of length 
status 
censoring indicator, i.e., vector of length 
x 

unpen.index 
vector of length 
standardize 
logical value indicating whether covariates should be standardized for estimation. This does not apply to mandatory covariates, which are never standardized. 
subset 
a vector specifying a subset of observations to be used in the fitting process. 
weights 
optional vector of length 
penalty 
penalty value for the update of an individual element of the parameter vector in each boosting step. 
criterion 
indicates the criterion to be used for selection in each boosting step. 
stepsize.factor 
determines the stepsize modification factor by which the natural step size of boosting steps should be changed after a covariate has been selected in a boosting step. The default (value 
sf.scheme 
scheme for changing step sizes (via 
pendistmat 
connection matrix with entries ranging between 0 and 1, with entry 
connected.index 
indices of the 
stepno 
number of boosting steps ( 
x.is.01 
logical value indicating whether (the nonmandatory part of) 
return.score 
logical value indicating whether the value of the score statistic (or penalized score statistic, depending on 
trace 
logical value indicating whether progress in estimation should be indicated by printing the name of the covariate updated. 
In contrast to gradient boosting (implemented e.g. in the glmboost routine in the R package mboost, using the CoxPH loss function), CoxBoost is not based on gradients of loss functions, but adapts the offset-based boosting approach from Tutz and Binder (2007) for estimating Cox proportional hazards models. In each boosting step the previous boosting steps are incorporated as an offset in penalized partial likelihood estimation, which is used to obtain an update for a single parameter, i.e., one covariate. This results in sparse fits similar to Lasso-like approaches, with many estimated coefficients being zero. The main model complexity parameter, which has to be selected (e.g. by cross-validation using cv.CoxBoost), is the number of boosting steps stepno. The penalty parameter penalty can be chosen rather coarsely, either by hand or using optimCoxBoostPenalty.
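The offset-based update described above can be sketched in a few lines of base R. This is a simplified illustration, not the package's implementation: the helper name `boost_step`, the simulated data, and the simplifications (no tied event times, no standardization, no mandatory covariates, a fixed penalty) are all assumptions made for the example. In each step the current linear predictor enters as an offset, a penalized one-step update is computed for every covariate from the partial likelihood score and information, and only the covariate with the largest penalized score statistic is updated.

```r
# Illustrative sketch only; boost_step() is a hypothetical helper, not part of CoxBoost.
set.seed(1)
n <- 200; p <- 100
x <- matrix(rnorm(n * p), n, p)
true.beta <- c(rep(1, 10), rep(0, p - 10))
real.time <- -log(runif(n)) / (10 * exp(drop(x %*% true.beta)))
cens.time <- rexp(n, rate = 1 / 10)
status <- as.numeric(real.time <= cens.time)
obs.time <- pmin(real.time, cens.time)

# Sort by decreasing time so that the risk set of row i is rows 1..i
ord <- order(obs.time, decreasing = TRUE)
x <- x[ord, ]; status <- status[ord]

boost_step <- function(x, status, eta, penalty) {
  w  <- exp(eta)                          # offset from the previous boosting steps
  S0 <- cumsum(w)                         # risk-set sums of the weights
  m1 <- apply(w * x, 2, cumsum) / S0      # weighted risk-set means
  m2 <- apply(w * x^2, 2, cumsum) / S0    # weighted risk-set second moments
  U  <- colSums(status * (x - m1))        # score per covariate (update started at 0)
  I  <- colSums(status * (m2 - m1^2))     # information per covariate
  j  <- which.max(U^2 / (I + penalty))    # penalized score statistic
  list(index = j, update = U[j] / (I[j] + penalty))
}

stepno <- 50; penalty <- 100
beta <- numeric(p)
for (step in seq_len(stepno)) {
  s <- boost_step(x, status, drop(x %*% beta), penalty)
  beta[s$index] <- beta[s$index] + s$update   # update one coefficient per step
}
sum(beta != 0)   # number of covariates selected at least once
```

Because only one coefficient changes per step, at most stepno coefficients can be nonzero, which is the source of the sparsity mentioned above.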
The advantage of the offset-based approach compared to gradient boosting is that the penalty structure is very flexible. In the present implementation this is used for allowing for unpenalized mandatory covariates, which receive a very fast coefficient build-up in the course of the boosting steps, while the other (optional) covariates are subjected to penalization.
For example, in a microarray setting, the (many) microarray features would be taken to be optional covariates, and the (few) potential clinical covariates would be taken to be mandatory, by including their indices in unpen.index.
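A minimal sketch of this setup, assuming hypothetical simulated data: the clinical covariates are placed in the first columns of the covariate matrix, and their column indices are passed as unpen.index. The matrix names `clin` and `expr` and all simulated values are invented for illustration; only the CoxBoost arguments shown in the usage above are used, and the fit is attempted only if the package is installed.

```r
# Hypothetical data: 3 clinical covariates and 50 expression features
set.seed(2)
n <- 100
clin <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("clin", 1:3)))
expr <- matrix(rnorm(n * 50), n, 50, dimnames = list(NULL, paste0("gene", 1:50)))
x <- cbind(clin, expr)
unpen.index <- seq_len(ncol(clin))   # columns of x treated as mandatory

# Simulated survival times and censoring indicator (for illustration only)
obs.time <- rexp(n, rate = exp(0.5 * x[, 1]))
status <- rbinom(n, 1, 0.7)

# Fit only if the CoxBoost package is available
if (requireNamespace("CoxBoost", quietly = TRUE)) {
  library(CoxBoost)
  fit <- CoxBoost(time = obs.time, status = status, x = x,
                  unpen.index = unpen.index, stepno = 50, penalty = 100)
  summary(fit)
}
```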
If a group of correlated covariates has influence on the response, e.g. genes from the same pathway, componentwise boosting will often result in a nonzero estimate for only one member of this group. To avoid this, information on the connection between covariates can be provided in pendistmat. If then, in addition, a penalty updating scheme with stepsize.factor < 1 is chosen, connected covariates are more likely to be chosen in future boosting steps, if a directly connected covariate has been chosen in an earlier boosting step (see Binder and Schumacher, 2009b).
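As a rough sketch of how such connection information might be supplied, the following builds a symmetric 0/1 connection matrix for one hypothetical pathway and passes it together with stepsize.factor < 1. The pathway membership, the simulated data, and the choice to use a full p x p matrix with a zero diagonal are assumptions for illustration; the exact dimension requirements for pendistmat and connected.index should be checked against the argument descriptions of the installed package version.

```r
set.seed(3)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
obs.time <- rexp(n, rate = exp(0.5 * rowSums(x[, 1:4])))
status <- rbinom(n, 1, 0.7)

# Hypothetical pathway: covariates 1 to 4 are mutually connected
pathway <- 1:4
pendistmat <- matrix(0, p, p)
pendistmat[pathway, pathway] <- 1
diag(pendistmat) <- 0                 # assumption: no self-connections

# Fit only if the CoxBoost package is available
if (requireNamespace("CoxBoost", quietly = TRUE)) {
  library(CoxBoost)
  fit <- CoxBoost(time = obs.time, status = status, x = x,
                  pendistmat = pendistmat, stepsize.factor = 0.5,
                  stepno = 50, penalty = 100)
  summary(fit)
}
```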
CoxBoost returns an object of class CoxBoost.
n, p 
number of observations and number of covariates. 
stepno 
number of boosting steps. 
xnames 
vector of length 
are used.
coefficients 

.
scoremat 

meanx, sdx 
vector of mean values and standard deviations used for standardizing the covariates. 
unpen.index 
indices of the mandatory covariates in the original covariate matrix 
penalty 
If 
time 
observed times given in the 
status 
censoring indicator given in the 
event.times 
vector with event times from the data given in the 
linear.predictors 

Lambda 
matrix with the Breslow estimate for the cumulative baseline hazard for boosting steps 
logplik 
partial log-likelihood of the fitted model in the final boosting step. 
Written by Harald Binder binderh@uni-mainz.de.
Binder, H., Benner, A., Bullinger, L., and Schumacher, M. (2013). Tailoring sparse multivariable regression techniques for prognostic single-nucleotide polymorphism signatures. Statistics in Medicine, doi: 10.1002/sim.5490.
Binder, H., Allignol, A., Schumacher, M., and Beyersmann, J. (2009). Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics, 25:890-896.
Binder, H. and Schumacher, M. (2009). Incorporating pathway information into boosting estimation of high-dimensional risk prediction models. BMC Bioinformatics, 10:18.
Binder, H. and Schumacher, M. (2008). Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics, 9:14.
Tutz, G. and Binder, H. (2007). Boosting ridge regression. Computational Statistics & Data Analysis, 51(12):6044-6059.
Fine, J. P. and Gray, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 94:496-509.
predict.CoxBoost, cv.CoxBoost.
# Generate some survival data with 10 informative covariates
n <- 200; p <- 100
beta <- c(rep(1,10),rep(0,p-10))
x <- matrix(rnorm(n*p),n,p)
real.time <- -(log(runif(n)))/(10*exp(drop(x %*% beta)))
cens.time <- rexp(n,rate=1/10)
status <- ifelse(real.time <= cens.time,1,0)
obs.time <- ifelse(real.time <= cens.time,real.time,cens.time)
# Fit a Cox proportional hazards model by CoxBoost
cbfit <- CoxBoost(time=obs.time,status=status,x=x,stepno=100,penalty=100)
summary(cbfit)
# ... with covariates 1 and 2 being mandatory
cbfit.mand <- CoxBoost(time=obs.time,status=status,x=x,unpen.index=c(1,2),
                       stepno=100,penalty=100)
summary(cbfit.mand)
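Since the number of boosting steps is the main complexity parameter, a fit would usually be preceded by cross-validation. The sketch below regenerates the simulated data and selects stepno with cv.CoxBoost before fitting. The arguments maxstepno and K and the result element optimal.step reflect the cv.CoxBoost interface as documented on its own help page; treat them as assumptions here and verify them against the installed version. The fit is attempted only if the package is installed.

```r
set.seed(4)
n <- 200; p <- 100
beta <- c(rep(1, 10), rep(0, p - 10))
x <- matrix(rnorm(n * p), n, p)
real.time <- -log(runif(n)) / (10 * exp(drop(x %*% beta)))
cens.time <- rexp(n, rate = 1 / 10)
status <- ifelse(real.time <= cens.time, 1, 0)
obs.time <- pmin(real.time, cens.time)

if (requireNamespace("CoxBoost", quietly = TRUE)) {
  library(CoxBoost)
  # 10-fold cross-validation over up to 200 boosting steps
  cv.res <- cv.CoxBoost(time = obs.time, status = status, x = x,
                        maxstepno = 200, K = 10, penalty = 100)
  # Refit on all data with the cross-validated number of steps
  cbfit.cv <- CoxBoost(time = obs.time, status = status, x = x,
                       stepno = cv.res$optimal.step, penalty = 100)
  summary(cbfit.cv)
}
```

The same penalty is used for cross-validation and for the final fit, since the optimal number of steps depends on the penalty value.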
