View source: R/summaryByCrossValid.r
summaryByCrossValid (R Documentation)
Description:

This function summarizes models calibrated using the trainByCrossValid function. It returns aspects of the best models across k-folds (the particular aspects depend on the kind of model used).
Usage:

summaryByCrossValid(
  x,
  trainFxName = "trainGlm",
  metric = "cbiTest",
  decreasing = TRUE
)
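As a quick orientation, here is a minimal sketch of a typical call. The object name cvModel is hypothetical and stands in for the result of trainByCrossValid; the argument values shown are simply the defaults listed above.

# minimal sketch: cvModel is a hypothetical object returned by trainByCrossValid()
bestModels <- summaryByCrossValid(
  x = cvModel,               # cross-validation output
  trainFxName = "trainGlm",  # function used to train each fold's models
  metric = "cbiTest",        # choose each fold's best model by test-fold CBI
  decreasing = TRUE          # higher metric values indicate better models
)
bestModels  # one data frame summarizing the best models across k-folds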
Arguments:

x: The object returned by the trainByCrossValid function.

trainFxName: Character, name of the function used to train the SDM within each fold (examples: "trainGlm", "trainMaxEnt", "trainBrt", "trainNs").

metric: Metric by which to select the best model in each k-fold. This can be any of the columns that appear in the data frames in the tuning element of x (for example, "cbiTest", the Continuous Boyce Index calculated on the test fold). See the sketch after this list for how to inspect the available columns.

decreasing: Logical. If TRUE (default), the best model in each k-fold is the one with the highest value of metric; if FALSE, it is the one with the lowest value.
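To see which column names are valid values of metric, you can inspect the per-fold tuning data frames directly. A brief sketch, assuming (as in the example below) that the cross-validation object stores these data frames in a list element named tuning; cvModel is again a hypothetical object returned by trainByCrossValid:

# column names of the first fold's tuning data frame
names(cvModel$tuning[[1]])
# any of these columns (e.g., "cbiTest") can be passed as the metric argument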
Value:

Data frame with statistics on the best set of models across k-folds (a brief sketch of inspecting this output follows the list below). Depending on the model algorithm, this could be:
BRTs (boosted regression trees): Learning rate, tree complexity, and bag fraction.
GLMs (generalized linear models): Frequency of use of each term in the best models.
Maxent: Frequency with which each specific combination of feature classes was used in the best models, plus the mean master regularization multiplier for each feature set.
NSs (natural splines): Data frame, one row per fold and one column per predictor, with values representing the maximum degrees of freedom used for each variable in the best model of each fold.
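Because the columns of the returned data frame depend on the algorithm, it is usually easiest to inspect the structure of the result before using it. A minimal sketch, again assuming a hypothetical cvModel produced by trainByCrossValid with models fit by trainGlm:

glmSummary <- summaryByCrossValid(cvModel, trainFxName = "trainGlm")
str(glmSummary)  # for GLMs, reports how often each term appeared in the best models
# terms that appear in the best model of most or all folds are good candidates
# for inclusion in a final model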
See Also:

trainByCrossValid
Examples:

## Not run: 
set.seed(123)

### contrived example
# generate training/testing data
n <- 10000
x1 <- seq(-1, 1, length.out=n) + rnorm(n)
x2 <- seq(10, 0, length.out=n) + rnorm(n)
x3 <- rnorm(n)
y <- 2 * x1 + x1^2 - 10 * x2 - x1 * x2
y <- statisfactory::probitAdj(y, 0)
y <- y^3
presAbs <- runif(n) < y
data <- data.frame(presAbs=presAbs, x1=x1, x2=x2, x3=x3)

model <- trainGlm(data)
summary(model)

folds <- dismo::kfold(data, 3)
out <- trainByCrossValid(data, folds=folds, verbose=1)
summaryByCrossValid(out)

str(out, 1)
head(out$tuning[[1]])
head(out$tuning[[2]])
head(out$tuning[[3]])

# can do the following for each fold (3 of them)
lapply(out$models[[1]], coefficients)
sapply(out$models[[1]], logLik)
sapply(out$models[[1]], AIC)

# select the model for k = 1 with the greatest CBI
top <- which.max(out$tuning[[1]]$cbiTest)
summary(out$models[[1]][[top]])

# in fold k = 1, which models perform well but are not overfit?
plot(out$tuning[[1]]$cbiTrain, out$tuning[[1]]$cbiTest, pch='.',
    main='Model Numbers for k = 1')
abline(0, 1, col='red')
numModels <- nrow(out$tuning[[1]])
text(out$tuning[[1]]$cbiTrain, out$tuning[[1]]$cbiTest, labels=1:numModels)
usr <- par('usr')
# note: par('usr') returns c(x1, x2, y1, y2), so x positions use usr[1:2]
x <- usr[1] + 0.9 * (usr[2] - usr[1])
y <- usr[3] + 0.1 * (usr[4] - usr[3])
text(x, y, labels='overfit', col='red', xpd=NA)
x <- usr[1] + 0.1 * (usr[2] - usr[1])
y <- usr[3] + 0.9 * (usr[4] - usr[3])
text(x, y, labels='suspicious', col='red', xpd=NA)

## End(Not run)