precalculation | R Documentation |
These predefined precalculation functions can be used to create custom objectives with createObjective. They perform a reclassification or a cross-validation and return the true labels and the predictions.
reclassification(data, labels,
classifier, classifierParams, predictorParams)
crossValidation(data, labels,
classifier, classifierParams, predictorParams,
ntimes = 10, nfold = 10,
leaveOneOut = FALSE, stratified = FALSE,
foldList = NULL)
data |
The data set to be used for the precalculation. This is usually a matrix or data frame with the samples in the rows and the features in the columns. |
labels |
A vector of class labels for the samples in data. |
classifier |
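A TuneParetoClassifier wrapper object containing the classifier to be trained and applied, for example the object returned by tunePareto.svm(). |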
classifierParams |
A named list of parameter assignments for the training routine of the classifier. |
predictorParams |
If the classifier consists of separate training and prediction functions, a named list of parameter assignments for the predictor function. |
nfold |
The number of groups (folds) of the cross-validation. Ignored if leaveOneOut is TRUE. |
ntimes |
The number of repeated runs of the cross-validation. Ignored if leaveOneOut is TRUE. |
leaveOneOut |
If this is TRUE, a leave-one-out cross-validation is performed, i.e. each sample is left out once in the training phase and used as a test sample. |
stratified |
If set to true, a stratified cross-validation is carried out. That is, the percentage of samples from different classes in the cross-validation folds corresponds to the class sizes in the complete data set. If set to false, the folds may be unbalanced. |
foldList |
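If this parameter is set, the other cross-validation parameters (ntimes, nfold, leaveOneOut, stratified) are ignored, and the cross-validation partition supplied in foldList is used instead. Such partitions can be generated with generateCVRuns. |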
reclassification trains the classifier with the full data set. Afterwards, the classifier is applied to the same data set.
crossValidation partitions the samples in the data set into a number of groups (depending on nfold and leaveOneOut). Each of these groups is left out once in the training phase and used for prediction. The whole procedure is repeated several times (as specified in ntimes).
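The partitioning described above can be sketched in base R. This is only an illustration under stated assumptions, not TunePareto's own implementation: makeFolds is a hypothetical helper, and the layout it returns (a list of runs, each a list of folds holding the indices of the left-out samples) is assumed to resemble what a foldList contains.

```r
# Hypothetical helper, not part of TunePareto: builds 'ntimes' runs,
# each a list of folds with the indices of the left-out samples.
makeFolds <- function(labels, nfold = 10, ntimes = 10,
                      leaveOneOut = FALSE, stratified = FALSE)
{
  n <- length(labels)
  if (leaveOneOut)
    # one fold per sample; repeated runs would be identical
    return(list(as.list(seq_len(n))))

  lapply(seq_len(ntimes), function(run)
  {
    if (stratified)
    {
      # shuffle within each class, then deal the samples out
      # round-robin so every fold receives a share of each class
      byClass <- lapply(split(seq_len(n), labels),
                        function(idx) idx[sample.int(length(idx))])
      ordered <- unlist(byClass, use.names = FALSE)
    }
    else
      ordered <- sample.int(n)
    foldId <- rep(seq_len(nfold), length.out = n)
    unname(split(ordered, foldId))
  })
}

set.seed(1)
folds <- makeFolds(iris$Species, nfold = 5, ntimes = 2, stratified = TRUE)
# class counts among the left-out samples of the first fold of run 1
print(table(iris$Species[folds[[1]][[1]]]))
```

With stratified = FALSE the class counts per fold vary from run to run, which is the imbalance the stratified option is meant to avoid.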
reclassification returns a list with the following components:
trueLabels: The original labels of the data set as supplied in labels
predictedLabels: A vector of predicted labels of the data set
model: The TuneParetoModel object resulting from the classifier training
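As a rough illustration of how such a result might be consumed, the sketch below uses a hand-made list instead of a real reclassification call. The component names trueLabels and predictedLabels are taken from the example objective on this page; the labels themselves are invented for the illustration.

```r
# Hand-made stand-in for a reclassification result (invented labels)
res <- list(trueLabels      = c("a", "a", "b", "b", "b"),
            predictedLabels = c("a", "b", "b", "b", "a"))

# confusion table: rows are true classes, columns are predictions
confusion <- table(true = res$trueLabels, predicted = res$predictedLabels)
print(confusion)

# fraction of samples that were assigned the wrong class
reclassError <- mean(res$trueLabels != res$predictedLabels)
print(reclassError)
```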
crossValidation
returns a nested list structure. At the top level, there is one list element for each run of the cross-validation. Each of these elements consists of a list of sub-structures for each fold. The sub-structures have the following components:
trueLabels: The original labels of the test samples in the fold
predictedLabels: A vector of predicted labels of the test samples in the fold
model: The TuneParetoModel object resulting from the classifier training in the fold
That is, for a cross-validation with n runs and m folds, there are n top-level lists, each having m sub-lists comprising the true labels and the predicted labels.
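The nested structure can be walked with sapply and lapply. The following sketch uses a hand-made stand-in for a crossValidation result with n = 2 runs and m = 2 folds; the labels are invented and the model component is omitted for brevity.

```r
# Hand-made stand-in for a crossValidation result: 2 runs x 2 folds
result <- list(
  list(list(trueLabels = c("a", "a", "b"), predictedLabels = c("a", "b", "b")),
       list(trueLabels = c("b", "a", "b"), predictedLabels = c("b", "a", "b"))),
  list(list(trueLabels = c("a", "b", "b"), predictedLabels = c("a", "b", "a")),
       list(trueLabels = c("a", "a", "b"), predictedLabels = c("b", "a", "a")))
)

# pool the folds of each run, then compute one error rate per run
errorPerRun <- sapply(result, function(run)
{
  trueLabels      <- unlist(lapply(run, function(fold) fold$trueLabels))
  predictedLabels <- unlist(lapply(run, function(fold) fold$predictedLabels))
  mean(trueLabels != predictedLabels)
})

print(errorPerRun)        # one value per cross-validation run
print(mean(errorPerRun))  # average over the runs
```

An objective function passed to createObjective receives a structure of this shape as its first argument, so the same traversal applies there.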
See also: createObjective, generateCVRuns.
library(TunePareto)

# create a new objective minimizing the
# false positives of a cross-validation
cvFalsePositives <- function(nfold=10, ntimes=10, leaveOneOut=FALSE, foldList=NULL, caseClass)
{
  return(createObjective(
            precalculationFunction = "crossValidation",
            precalculationParams = list(nfold=nfold,
                                        ntimes=ntimes,
                                        leaveOneOut=leaveOneOut,
                                        foldList=foldList),
            objectiveFunction =
            function(result, caseClass)
            {
              # take mean value over the cv runs
              return(mean(sapply(result,
                    function(run)
                    # iterate over runs of cross-validation
                    {
                      # extract all predicted labels in the folds
                      predictedLabels <-
                            unlist(lapply(run,
                                   function(fold)fold$predictedLabels))

                      # extract all true labels in the folds
                      trueLabels <-
                            unlist(lapply(run,
                                   function(fold)fold$trueLabels))

                      # calculate number of false positives in the run
                      return(sum(predictedLabels == caseClass &
                                 trueLabels != caseClass))
                    })))
            },
            objectiveFunctionParams = list(caseClass=caseClass),
            direction = "minimize",
            name = "CV.FalsePositives"))
}

# use the objective in an SVM cost parameter tuning on the 'iris' data set
r <- tunePareto(data = iris[, -ncol(iris)],
                labels = iris[, ncol(iris)],
                classifier = tunePareto.svm(),
                cost = c(0.001,0.005,0.01,0.05,0.1,0.5,1,5,10,50),
                objectiveFunctions=list(cvFalsePositives(10, 10, caseClass="setosa")))
print(r)