Use cross-validation to fit an L1-regularized linear interval
regression model by optimizing margin and/or regularization
parameters.
This function repeatedly calls IntervalRegressionRegularized, and by
default assumes that margin=1. To optimize the margin, specify the
margin.vec parameter manually, or use IntervalRegressionCVmargin
(which takes more computation time but yields more accurate models).
If the future package is available, two levels of future_lapply are
used to parallelize on validation.fold and margin.
IntervalRegressionCV(feature.mat,
    target.mat, n.folds = ifelse(nrow(feature.mat) <
        10, 3L, 5L),
    fold.vec = sample(rep(1:n.folds,
        l = nrow(feature.mat))),
    verbose = 0, min.observations = 10,
    reg.type = "min",
    incorrect.labels.db = NULL,
    initial.regularization = 0.001,
    margin.vec = 1, LAPPLY = NULL,
    check.unlogged = TRUE,
    ...)
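
The default fold.vec in the usage above assigns each observation to a random fold, with fold sizes as equal as possible. As a minimal base R sketch (the toy size n = 23 is an assumption for illustration, not something from the package), a fold vector can also be built by hand, for example to keep fold assignments reproducible across runs:

```r
## Toy number of observations (hypothetical; in practice this would
## be nrow(feature.mat)).
n <- 23
## Same rule as the default: 3 folds below 10 observations, otherwise 5.
n.folds <- ifelse(n < 10, 3L, 5L)
set.seed(1)
## Repeat the fold ids 1..n.folds up to length n, then shuffle them.
fold.vec <- sample(rep(1:n.folds, length.out = n))
## Fold sizes differ by at most one observation.
table(fold.vec)
```

A vector built this way could then be supplied as the fold.vec argument instead of relying on the random default.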

feature.mat 
Numeric feature matrix, n observations x p features. 
target.mat 
Numeric target matrix, n observations x 2 limits. These should be
real-valued (possibly negative). If your data are interval-censored
positive-valued survival times, you need to log them to
obtain 
n.folds 
Number of cross-validation folds. 
fold.vec 
Integer vector of fold id numbers. 
verbose 
numeric: 0 for silent, bigger numbers (1 or 2) for more output. 
min.observations 
stop with an error if there are fewer than this many observations. 
reg.type 
Either "1sd" or "min", which specifies how the regularization parameter is chosen during the internal cross-validation loop. min: first take the mean of the K-CV error functions, then minimize it (this is the default since it tends to yield the least test error). 1sd: take the most regularized model with the same margin which is within one standard deviation of that minimum (this model is typically a bit less accurate, but much less complex, so better if you want to interpret the coefficients). 
incorrect.labels.db 
either NULL or a data.table, which specifies the error function to
compute for selecting the regularization parameter on the
validation set. NULL means to minimize the squared hinge loss,
which measures how far the predicted log(penalty) values are from
the target intervals. If a data.table is specified, its first key
should correspond to the rownames of 
initial.regularization 
Passed to 
margin.vec 
numeric vector of margin size hyperparameters. The computation
time is linear in the number of elements of 
LAPPLY 
Function to use for parallelization, by default

check.unlogged 
If TRUE, stop with an error if target matrix is nonnegative and has any big difference in successive quantiles (this is an indicator that the user probably forgot to log their outputs). 
... 
passed to 
List representing regularized linear model.
Toby Dylan Hocking
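
The target.mat and check.unlogged descriptions say that positive-valued, interval-censored survival times must be logged before fitting. A minimal base R sketch of that preprocessing follows. The times are made-up values, and encoding censored limits as 0 and Inf before the log (so that they become -Inf and Inf afterwards) is an assumption made for this illustration:

```r
## Hypothetical interval-censored survival times (made-up values).
## Inf marks a right-censored upper limit, 0 a left-censored lower limit.
lower <- c(0, 2, 10, 50)
upper <- c(1, 5, Inf, 200)
## Log both limits: log(0) is -Inf and log(Inf) is Inf, so censored
## limits become infinite limits on the real line.
target.mat <- log(cbind(lower, upper))
target.mat
```

The result is an n x 2 real-valued matrix (possibly with negative and infinite entries), which is the shape the target.mat argument describes.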
if(interactive()){
  library(penaltyLearning)
  data("neuroblastomaProcessed", package="penaltyLearning", envir=environment())
  if(require(future)){
    ## multiprocess is deprecated in recent versions of future;
    ## multisession is the portable replacement.
    plan(multisession)
  }
  set.seed(1)
  i.train <- 1:100
  fit <- with(neuroblastomaProcessed, IntervalRegressionCV(
    feature.mat[i.train,], target.mat[i.train,],
    verbose=0))
  ## When only features and target matrices are specified for
  ## training, the squared hinge loss is used as the metric to
  ## minimize on the validation set.
  plot(fit)
  ## Create an incorrect labels data.table (first key is same as
  ## rownames of feature.mat and target.mat).
  library(data.table)
  errors.per.model <- data.table(neuroblastomaProcessed$errors)
  errors.per.model[, pid.chr := paste0(profile.id, ".", chromosome)]
  setkey(errors.per.model, pid.chr)
  set.seed(1)
  fit <- with(neuroblastomaProcessed, IntervalRegressionCV(
    feature.mat[i.train,], target.mat[i.train,],
    ## The incorrect.labels.db argument is optional, but can be used if
    ## you want to use AUC as the CV model selection criterion.
    incorrect.labels.db=errors.per.model))
  plot(fit)
}
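
The difference between reg.type="min" and reg.type="1sd" can be illustrated with a small base R sketch. This is not the package's internal code: the error matrix holds made-up numbers, the variable names are hypothetical, and the margin dimension is ignored. It only shows the two selection rules as they are described for the reg.type argument:

```r
## Hypothetical validation errors: one row per fold, one column per
## regularization value (made-up numbers, regularization increasing
## from left to right).
regularization <- c(0.001, 0.01, 0.1, 1, 10)
err.mat <- rbind(
  c(10, 6, 4, 5, 12),
  c(12, 8, 6, 5, 14),
  c(11, 7, 5, 7, 13))
## Mean and standard deviation of the error over folds.
mean.err <- colMeans(err.mat)
sd.err <- apply(err.mat, 2, sd)
## "min": the regularization value with the smallest mean error.
i.min <- which.min(mean.err)
## "1sd": the most regularized value whose mean error is within one
## standard deviation of that minimum.
i.1sd <- max(which(mean.err <= mean.err[i.min] + sd.err[i.min]))
c(min = regularization[i.min], "1sd" = regularization[i.1sd])
```

With these toy numbers the "min" rule selects regularization 0.1 and the "1sd" rule selects the larger value 1, trading a slightly higher mean validation error for a more heavily regularized model.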
