Function for the estimation of mstop via cross-validation and (generic) functions to print or plot the results.
object | an object of class
folds | a weight matrix with number of rows equal to the number of observations. The number of columns corresponds to the number of cross-validation runs.
grid | a vector of iterations the empirical risk is to be evaluated for. Per default the empirical risks for all iterations
x | an object of class
ylab | a title for the y axis.
ylim | the y limits of the plot.
main | the main title of the plot.
... | additional arguments to be passed to
The number of boosting iterations is a hyper-parameter of the
boosting algorithms. Cross-validated estimates of the empirical risk
for different values of mstop (as given by grid) are computed, which
are used to choose the appropriate number of boosting iterations to
be applied.
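As an illustration of how cross-validated risks can be turned into a choice of mstop, the following base R sketch averages out-of-bag risks over the cross-validation runs and picks the iteration with the smallest mean. The risk matrix is simulated, and its layout (one row per run, one column per entry of grid) is an assumption made for this sketch, not the documented internal structure of a cv object.

```r
## Simulated out-of-bag risks (hypothetical data, not produced by cv()).
## Assumed layout: one row per cross-validation run,
## one column per candidate number of iterations in 'grid'.
set.seed(1)
grid <- 1:100
n_runs <- 5

## U-shaped risk curve with its minimum at iteration 40, plus small noise
true_curve <- (grid - 40)^2 / 1000 + 2
risk <- t(replicate(n_runs, true_curve + rnorm(length(grid), sd = 0.01)))

## Average over runs and select the iteration with the smallest mean risk
mean_risk <- colMeans(risk)
mstop_hat <- grid[which.min(mean_risk)]
mstop_hat
```

With a real model, mstop(cvm) from the examples section is the supported way to extract the stopping iteration; the sketch only shows the idea behind it.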
Different forms of cross-validation can be applied, for example,
5-fold or 10-fold cross-validation. Bootstrapping is not implemented
so far. The weights are defined via the folds matrix.
At the moment they can only be used to specify a learning
sample, which consists of observations with weights == 1,
and an out-of-bag sample with weights == 0. The latter
is used to determine the empirical risk (negative log likelihood).
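A folds matrix of this form can be built in several ways. The sketch below uses base R only and assigns observations to folds at random, so that each observation is out-of-bag (weight 0) in exactly one column and in the learning sample (weight 1) in all others. The names fold_id and folds, and the sample size, are chosen for illustration.

```r
## Build a 0/1 weight matrix for k-fold cross-validation (illustrative sketch)
set.seed(42)
n <- 400   # number of observations
k <- 5     # number of cross-validation runs

## Randomly assign each observation to one of k folds of (nearly) equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

## Column j: weight 0 for observations in fold j (out-of-bag),
## weight 1 for all other observations (learning sample)
folds <- sapply(seq_len(k), function(j) as.numeric(fold_id != j))

dim(folds)            # n rows, k columns
colSums(folds == 0)   # size of the out-of-bag sample in each run
```

Random assignment avoids any dependence on the ordering of the data; the resulting matrix has the shape described for the folds argument.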
If package multicore is available, cv runs in parallel on the
available cores/processors. The scheduling can be changed by the
corresponding arguments of mclapply (via the dot arguments).
No trace output is given when running in parallel.
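To show what changing the scheduling via mclapply arguments can look like, here is a minimal sketch using mclapply from the parallel package, which is an assumption of this sketch (the text above refers to multicore). The function fold_risk is a hypothetical stand-in for the per-fold work and does not reproduce what cv does internally.

```r
library(parallel)

## Hypothetical stand-in for the work done in one cross-validation run:
## here it only returns the share of out-of-bag observations in column j
fold_risk <- function(j, folds) {
  w <- folds[, j]
  mean(w == 0)
}

## Small 0/1 weight matrix with n rows and k columns
n <- 20
k <- 4
fold_id <- rep(seq_len(k), length.out = n)
folds <- sapply(seq_len(k), function(j) as.numeric(fold_id != j))

## Forking is not available on Windows, where mclapply needs mc.cores = 1
cores <- if (.Platform$OS.type == "windows") 1L else 2L

## mc.preschedule = FALSE forks a separate job for each fold
## instead of splitting the folds among the cores up front
res <- mclapply(seq_len(k), fold_risk, folds = folds,
                mc.cores = cores, mc.preschedule = FALSE)
unlist(res)
```

Scheduling arguments such as mc.preschedule and mc.cores are the kind of settings that, per the description above, would be handed on through the dot arguments of cv.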
cv returns an object of class cv, which consists of
a matrix of empirical risks and some further attributes.
See cfboost for model fitting, risk for methods to extract the
inbag and out-of-bag risk, and mstop for functions to extract the
(optimal) stopping iteration (based on cross-validation or on the
inbag and out-of-bag risk).
## fit a model with all observations first
## Not run:
## (as this takes some minutes)
set.seed(1234)
## sample covariates first
X <- matrix(NA, nrow=400, ncol=3)
X[,1] <- runif(400, -1, 1)
X[,2] <- runif(400, -1, 1)
X[,3] <- runif(400, -1, 1)
## time-dependent hazard rate
lambda <- function(time, x){
    exp(0 * time + 0.7 * x[1] + x[2]^2)
}
## specify censoring function
cens_fct <- function(time, mean_cens){
    censor_time <- rexp(n = length(time), rate = 1/mean_cens)
    event <- (time <= censor_time)
    t_obs <- apply(cbind(time, censor_time), 1, min)
    return(cbind(t_obs, event))
}
daten <- rSurvTime(lambda, X, cens_fct, mean_cens = 5)
ctrl <- boost_control(mstop = 100, risk = "none")
## fit (a simple) model
model <- cfboost(Surv(time, event) ~ bbs(x.1) + bbs(x.2) + bbs(x.3),
                 control = ctrl, data = daten)
## End(Not run)
## 5-fold cross-validation
## Not run:
## (as this takes some minutes)
n <- nrow(daten)
k <- 5
ntest <- floor(n / k)
## weight matrix: 0 = out-of-bag, 1 = learning sample; one column per fold
cv5f <- matrix(c(rep(c(rep(0, ntest), rep(1, n)), k - 1),
                 rep(0, n * k - (k - 1) * (n + ntest))), nrow = n)
cvm <- cv(model, folds = cv5f)
print(cvm)
plot(cvm)
mstop(cvm)
## End(Not run)