Balanced K-fold cross-validation based on an "epx" object.
Because the phalanxes already formed in that object are reused and the
phalanx-formation algorithm is not re-run for each fold, the
cross-validation is biased.
epx | Object of class "epx".
folds | Optional vector specifying the fold to which each observation belongs. Must be a vector of length n (n being the number of observations) containing only integer values in the range from 1 to K.
K | Number of folds; default is 10.
folds.out | Indicates whether a vector of fold membership for each observation is included in the output; default is FALSE.
classifier.args | Arguments for the base classifier specified by epx.
performance.args | Arguments for the performance measure specified by epx.
... | Further arguments passed to or from other methods.
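The constraints on folds (length n, integer values from 1 to K) can be illustrated with a small base R sketch. The helper make_folds below is hypothetical and not part of the package. It also assumes that "balanced" means each fold receives a similar share of each class, which this page does not spell out.

```r
# Hypothetical helper (not part of the package): assign each observation to
# one of K folds so that every fold gets a similar share of each class.
make_folds <- function(y, K = 10) {
  folds <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Shuffle the members of this class, then deal them out to folds 1..K
    idx <- idx[sample.int(length(idx))]
    folds[idx] <- rep_len(seq_len(K), length(idx))
  }
  folds
}

set.seed(1)
y <- rep(c(0, 1), times = c(90, 10))  # toy response with 10 relevant cases
folds <- make_folds(y, K = 5)

# Checks matching the documented requirements for the folds argument
length(folds) == length(y)   # one entry per observation
all(folds %in% 1:5)          # integer values in 1..K only
tab <- table(folds, y)       # class counts per fold
tab
```

A vector built this way could then be supplied through the folds argument, provided its length equals the number of observations used to train the "epx" object.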
An (n + 1) by (p + 1) matrix, where n is the number of observations used to
train epx and p is the number of (final) phalanxes. Columns 1 through p
correspond to the individual phalanxes, and column p + 1 contains the
predicted probabilities of relevance from the ensemble of phalanxes. Row
n + 1 holds the performance of the corresponding column (the performance
measure is determined by the "epx" object).
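The layout described above can be unpacked as in the following sketch. It uses a made-up matrix as a stand-in for a real result (n = 4 observations, p = 2 phalanxes). All numbers are invented for illustration, and the object names are assumptions.

```r
# Toy stand-in for the returned matrix: n = 4 observations, p = 2 phalanxes.
# All values are invented for illustration only.
n <- 4
p <- 2
cv.out <- rbind(
  matrix(c(0.90, 0.20, 0.10, 0.70,    # column 1: phalanx 1
           0.80, 0.30, 0.20, 0.60,    # column 2: phalanx 2
           0.85, 0.25, 0.15, 0.65),   # column p + 1: ensemble
         nrow = n, ncol = p + 1),
  c(0.71, 0.68, 0.74)                 # row n + 1: performance per column
)

preds <- cv.out[seq_len(n), , drop = FALSE]  # predicted probabilities
perf  <- cv.out[n + 1, ]                     # performance of each column

ensemble.pred <- preds[, p + 1]  # ensemble predictions for the n observations
ensemble.perf <- perf[p + 1]     # ensemble performance (bottom-right cell)
```

The same indexing applies to a real result: rows 1 to n are predictions, and the last row must be dropped before treating the columns as probabilities.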
Setting folds.out to TRUE changes the output of cv.epx into a list of two elements:
EPX.CV | The (n + 1) by (p + 1) matrix returned by default when folds.out = FALSE.
FOLDS.USED | A vector of length n with integer values only in the range from 1 to K, indicating the fold to which each observation was assigned.
# Example with data(harvest)
library(EPX) # assumed package providing epx(), cv.epx() and the harvest data
## Phalanx-formation using a base classifier with 50 trees (default = 500)
set.seed(761)
model <- epx(x = harvest[, -4], y = harvest[, 4],
             classifier.args = list(ntree = 50))
## 10-fold balanced cross-validation (different base classifier settings)
## Not run:
set.seed(761)
cv.100 <- cv.epx(model, classifier.args = list(ntree = 100))
tail(cv.100) # see performance (here, AHR) for all phalanxes and the ensemble
## Option to output the vector assigning observations to the K folds
## (Commented out for speed.)
set.seed(761)
cv.folds <- cv.epx(model, folds.out = TRUE)
tail(cv.folds[[1]]) # EPX.CV matrix, same layout as cv.100 above
table(cv.folds[[2]]) # number of observations in each of the 10 folds
## 10 runs of 10-fold balanced cross-validation (using default settings)
set.seed(761)
cv.ahr <- numeric(10) # store AHR of each ensemble
for (i in seq_along(cv.ahr)) {
  cv.i <- cv.epx(model)
  cv.ahr[i] <- cv.i[nrow(cv.i), ncol(cv.i)] # bottom-right cell: ensemble AHR
}
boxplot(cv.ahr) # to see variation in AHR
## End(Not run)