View source: R/genericFunctions.R
generic.cv | R Documentation
Performs k-fold cross-validation 'nTimes' times for any specified algorithm, reporting the standard metric, test error (or MSE for regression), together with one optional additional metric (AUC, precision, F-score, ...).
generic.cv(X, Y, nTimes = 1, k = 10, seed = 2014, regression = TRUE, genericAlgo = NULL, specificPredictFunction = NULL, metrics = c("none", "AUC", "precision", "F-score", "L1", "geometric mean", "geometric mean (precision)"))
X: a matrix or data frame of observations.
Y: a vector of responses (a factor for classification).
nTimes: number of times the k-fold cross-validation is repeated.
k: number of folds.
seed: seed for reproducibility.
regression: if TRUE, performs regression.
genericAlgo: a wrapper function embedding the algorithm to assess; options can be added inside the wrapper. The default NULL is for convenience only; a wrapper function is required to run the cross-validation.
specificPredictFunction: if the assessed model does not support the generic R method 'predict', define here, as a function, how predictions are to be generated.
metrics: an additional metric to report alongside the standard one, test error (or MSE for regression).
a list with the following components:

testError: the values of the test error.
avgError: mean of the test error.
stdDev: standard deviation of the test error.
metric: values of the other chosen metric.
Saip Ciss saip.ciss@wanadoo.fr
## not run
# data(iris)
# Y <- iris$Species
# X <- iris[,-which(colnames(iris) == "Species")]

## 10-fold cross-validation for the randomUniformForest algorithm:
## create the wrapper function (setting 'threads = 1' since data are small)
# genericAlgo.ruf <- function(X, Y) randomUniformForest(X, Y,
#   OOB = FALSE, importance = FALSE, threads = 1)

## run
# rUF.10cv.iris <- generic.cv(X, as.factor(Y),
#   genericAlgo = genericAlgo.ruf, regression = FALSE)

## 10-fold cross-validation for the randomForest algorithm:
## create the wrapper function
# require(randomForest) || install.packages("randomForest")
# genericAlgo.rf <- function(X, Y) randomForest(X, Y)

## run
# RF.10cv.iris <- generic.cv(X, as.factor(Y),
#   genericAlgo = genericAlgo.rf, regression = FALSE)

## 10-fold cross-validation for the Gradient Boosting Machines algorithm (gbm package):
## create the wrapper function
# require(gbm) || install.packages("gbm")
# genericAlgo.gbm <- function(X, Y) gbm.fit(X, Y, distribution = "multinomial",
#   n.trees = 500, shrinkage = 0.05, interaction.depth = 24, n.minobsinnode = 1)

## create a wrapper for the prediction function of gbm
# nClasses = length(unique(Y))
# specificPredictFunction.gbm <- function(model, newdata)
# {
#   modelPrediction = predict(model, newdata, 500)
#   predictions = matrix(modelPrediction, ncol = nClasses)
#   colnames(predictions) = colnames(modelPrediction)
#   return(as.factor(apply(predictions, 1, function(Z) names(which.max(Z)))))
# }

## run
# gbm.10cv.iris <- generic.cv(X, Y, genericAlgo = genericAlgo.gbm,
#   specificPredictFunction = specificPredictFunction.gbm, regression = FALSE)

## 10-fold cross-validation for the CART algorithm (rpart package):
## note that '...' must appear in the wrapper's formals to be passed on to rpart
# require(rpart) || install.packages("rpart")
# genericAlgo.CART <- function(X, Y, ...)
# {
#   ZZ = data.frame(Y, X)
#   if (is.factor(Y)) { modelObject = rpart(Y ~ ., data = ZZ, method = "class", ...) }
#   else { modelObject = rpart(Y ~ ., data = ZZ, ...) }
#   return(modelObject)
# }

# specificPredictFunction.CART <- function(model, newdata)
#   predict(model, data.frame(newdata), type = "vector")

# CART.10cv.iris <- generic.cv(X, as.factor(Y), genericAlgo = genericAlgo.CART,
#   specificPredictFunction = specificPredictFunction.CART, regression = FALSE)
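The examples above are all classification runs. As a minimal sketch of the regression case (not from the original page; the names 'genericAlgo.lm' and 'lm.10cv.mtcars' are illustrative, and lm is used only because it supports the generic 'predict' method, so no specificPredictFunction should be needed):

## 10-fold cross-validation for a linear model, regression = TRUE (illustrative sketch)
# data(mtcars)
# Y <- mtcars$mpg
# X <- mtcars[, -which(colnames(mtcars) == "mpg")]
# genericAlgo.lm <- function(X, Y) lm(Y ~ ., data = data.frame(Y, X))
# lm.10cv.mtcars <- generic.cv(X, Y, genericAlgo = genericAlgo.lm, regression = TRUE)
# lm.10cv.mtcars$avgError   ## mean MSE over the folds
# lm.10cv.mtcars$stdDev     ## its standard deviation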