Runs cross-validation of Random Forests classification
rf.cross.validation(x, y, nfolds = 10, folds = NULL, verbose = FALSE, ...)
x | A matrix of numeric predictors
y | A factor or logical of responses
nfolds | Number of cross-validation folds (nfolds == -1 means leave-one-out)
folds | optional, if not
verbose | Use verbose output
... | Additional parameters for randomForest
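As a rough illustration of how nfolds relates to a fold assignment, the sketch below builds a fold vector in base R. The helper name make.folds is hypothetical and not part of the package, and the package's own fold assignment may differ (for example, it may balance classes across folds).

```r
# Hypothetical helper (not part of the package): assign each of n samples
# to one of nfolds folds; nfolds == -1 means leave-one-out.
make.folds <- function(n, nfolds = 10) {
  if (nfolds == -1) {
    # leave-one-out: every sample is its own fold
    return(seq_len(n))
  }
  # near-equal fold sizes, shuffled into random order
  sample(rep(seq_len(nfolds), length.out = n))
}

set.seed(1)
folds <- make.folds(100, nfolds = 10)
fold.sizes <- table(folds)           # number of samples in each fold
loo <- make.folds(5, nfolds = -1)    # one fold per sample
```

A vector like this might be what the folds argument accepts, but the expected format of folds is not stated above, so treat that as an assumption.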
y | Observed values
predicted | CV predicted values
probabilities | CV predicted class probabilities (or NULL if unavailable)
confusion.matrix | Confusion matrix (true-by-predicted)
nfolds | Number of folds used (or -1 for leave-one-out)
params | List of additional parameters, if any
importances | Feature-by-fold matrix of feature importances (mean decrease in accuracy) across folds
final.model | Final random forests model trained on the whole data set
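To make the true-by-predicted layout concrete, here is a minimal base R sketch that builds such a confusion matrix from observed and predicted labels and derives overall accuracy from it. The toy labels are made up for illustration, and this is not necessarily how the package computes confusion.matrix internally.

```r
# Toy observed and CV-predicted labels (made up for illustration)
y <- factor(c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE))
predicted <- factor(c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE),
                    levels = levels(y))

# True-by-predicted: rows are observed classes, columns are predicted classes
confusion.matrix <- table(true = y, predicted = predicted)

# Overall accuracy is the fraction of samples on the diagonal
accuracy <- sum(diag(confusion.matrix)) / sum(confusion.matrix)
```

The same two lines should apply to the y and predicted elements of a returned result, assuming both are factors with matching levels.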
# generate fake data: 100 samples, 20 variables
x <- matrix(rnorm(2000), 100)
rownames(x) <- sprintf('Sample%02d', 1:100)
colnames(x) <- sprintf('Feature%02d', 1:20)
# generate fake response variable, function of columns 1-5 plus noise
y <- rowSums(sweep(x[, 1:5], 2, runif(5), '*')) + rnorm(100, sd = .5)
# y is now a binary response variable
y <- factor(y > median(y))
names(y) <- rownames(x)
# 10-fold cross-validation, returning predictions
res.rf <- rf.cross.validation(x, y)
# plot importance of top ten variables
sorted.importances <- sort(rowMeans(res.rf$importances), decreasing = TRUE)
barplot(rev(sorted.importances[1:10]), horiz = TRUE, xlab = 'Mean Decrease in Accuracy')
# plot classification ROC curve
roc <- probabilities.to.ROC(res.rf$probabilities[, 2], y, plot = TRUE)
# report ROC AUC
cat('ROC AUC was:', roc$auc, '\n')
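For readers who want to see what the reported AUC measures, the sketch below computes ROC AUC directly from scores and binary labels using the Mann-Whitney rank-sum formulation. The helper auc.from.scores is hypothetical and not part of the package; probabilities.to.ROC may compute its value differently.

```r
# Hypothetical helper (not part of the package): ROC AUC from numeric scores
# and binary labels via the Mann-Whitney rank-sum formulation.
auc.from.scores <- function(scores, labels) {
  labels <- as.logical(labels)   # also works for a factor with TRUE/FALSE levels
  n.pos <- sum(labels)
  n.neg <- sum(!labels)
  r <- rank(scores)              # average ranks are used for tied scores
  (sum(r[labels]) - n.pos * (n.pos + 1) / 2) / (n.pos * n.neg)
}

# Toy scores and labels (made up for illustration)
scores <- c(0.1, 0.4, 0.35, 0.8)
labels <- c(FALSE, FALSE, TRUE, TRUE)
auc <- auc.from.scores(scores, labels)
```

With the example data, a call such as auc.from.scores(res.rf$probabilities[, 2], y) would be the analogous check, assuming the second probability column corresponds to the TRUE class.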