getStats: Get model validation statistics for a model object created by...

View source: R/seegSDM.R

Description

Given an object returned by runBRT, extract the validation statistics devBern, rmse, auc, Kappa, sensitivity, specificity and proportion correctly classified (pcc), calculated using functions from either the PresenceAbsence or seegSDM packages. Note that auc is calculated with a seegSDM clone of the auc function in PresenceAbsence in which worse-than-random AUC scores are not inverted.
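To illustrate what "not inverted" means, here is a minimal base R sketch. The helper auc_raw is hypothetical and uses the standard rank-based (Mann-Whitney) formulation of AUC; it is not the seegSDM or PresenceAbsence code. It assumes that an inverting implementation reports max(AUC, 1 - AUC).

```r
# Hypothetical helper: rank-based (Mann-Whitney) AUC, reported as-is.
auc_raw <- function(observed, predicted) {
  r  <- rank(predicted)            # average ranks handle ties
  n1 <- sum(observed == 1)         # number of presences
  n0 <- sum(observed == 0)         # number of absences
  (sum(r[observed == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# A deliberately poor predictor: absences mostly score higher than presences.
observed  <- c(1, 1, 1, 0, 0, 0)
predicted <- c(0.2, 0.3, 0.6, 0.4, 0.7, 0.9)

raw      <- auc_raw(observed, predicted)  # stays below 0.5
inverted <- max(raw, 1 - raw)             # assumed behaviour of an inverting AUC
```

Keeping the raw value means a model that ranks sites worse than chance is reported as such, rather than appearing to perform well.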

If cv = TRUE, the estimates returned are the means of the validation statistics, and of their standard deviations, calculated against the withheld data for each of the n.folds folds in the BRT run. That is, if the arguments n.folds = 10, bag.fraction = 0.75 are passed to runBRT, the resulting BRT model will be an average of 10 separate BRT models, each of them trained on a subset of 75% of the data. The validation statistics for each fold are calculated by comparing the predictions of that fold's model against the 25% of the data which was withheld for that fold. Estimated standard deviations for these statistics are also calculated (by the functions in the PresenceAbsence package). The means of these statistics across the 10 folds are what is reported.
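The averaging step can be sketched in base R as follows. This is an illustration only: the withheld data are simulated, fold_stats is a hypothetical helper, and the 0.5 cutoff is an assumption, so none of this reproduces the internals of getStats.

```r
set.seed(1)
n_folds <- 10

# Hypothetical helper: threshold-based statistics for one fold's withheld data.
fold_stats <- function(observed, predicted, cutoff = 0.5) {
  pred_class <- as.integer(predicted >= cutoff)
  c(sensitivity = mean(pred_class[observed == 1] == 1),
    specificity = mean(pred_class[observed == 0] == 0),
    pcc         = mean(pred_class == observed))
}

# Simulated withheld observations and fold-model predictions, one row per fold.
per_fold <- t(sapply(seq_len(n_folds), function(i) {
  observed  <- rep(c(0, 1), each = 25)
  predicted <- plogis(rnorm(50, mean = ifelse(observed == 1, 1, -1)))
  fold_stats(observed, predicted)
}))

# The reported value of each statistic is its mean across the folds.
cv_means <- colMeans(per_fold)
```

Here per_fold holds one row of statistics per fold, and cv_means is the kind of summary vector the paragraph above describes.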

If cv = TRUE and pwd = TRUE, these cv statistics are calculated using the pairwise distance sampling procedure (pwdSample) of Hijmans (2012), to ensure that accuracy statistics are not inflated by increasing the pseudo-absence selection distance in presence-only models.
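The idea behind pairwise distance sampling can be sketched in base R. This is a simplified planar illustration, not the dismo implementation: nearest_dist and pair_absences are hypothetical helpers, distances are Euclidean, and an absence may be matched to more than one presence. It assumes a pairing rule in which a test absence is accepted for a test presence when their distances to the nearest training presence differ by less than tr times the presence's distance.

```r
# Hypothetical helper: Euclidean distance from each row of `pts`
# to its nearest row in `ref` (both two-column coordinate matrices).
nearest_dist <- function(pts, ref) {
  apply(pts, 1, function(p) min(sqrt(colSums((t(ref) - p)^2))))
}

# Hypothetical helper: for each test presence, the index of the test absence
# whose distance to the training presences is most similar, or NA if no
# absence falls within the threshold.
pair_absences <- function(test_pres, test_abs, train_pres, tr = 1) {
  d_pres <- nearest_dist(test_pres, train_pres)
  d_abs  <- nearest_dist(test_abs, train_pres)
  sapply(d_pres, function(d) {
    gap  <- abs(d_abs - d)
    best <- which.min(gap)
    if (gap[best] < tr * d) best else NA_integer_
  })
}

train_pres <- rbind(c(0, 0))
test_pres  <- rbind(c(1, 0), c(5, 0))
test_abs   <- rbind(c(0, 1.2), c(0, 20), c(4, 0))

pairs_loose  <- pair_absences(test_pres, test_abs, train_pres, tr = 0.5)
pairs_strict <- pair_absences(test_pres, test_abs, train_pres, tr = 0.1)
```

With the looser threshold both test presences find a matching absence; with the stricter one neither does. Unmatched presences would be left out of the validation set.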

If cv = FALSE the statistics are calculated once on the full training set using the final full model. pwd = TRUE cannot be used in this case.

Usage

getStats(object, cv = TRUE, pwd = TRUE, threshold = 1, ...)

Arguments

object

A list of BRT model bootstraps, each element being an output from runBRT.

cv

Whether to calculate cross-validation statistics using folds (cv = TRUE) or the training-set validation statistics from the final model (cv = FALSE).

pwd

Whether to use the pairwise distance sampling procedure (pwdSample) of Hijmans (2012) to ensure that accuracy statistics are not inflated by increasing the pseudo-absence selection distance in presence-only models. Note that this procedure can only be used with cv = TRUE and where coordinates are available in object (see argument gbm.coords to runBRT).

threshold

The threshold distance for the pairwise distance sampling procedure, passed directly to the argument tr in pwdSample. This will be ignored if pwd = FALSE.

...

Other arguments to pass to pwdSample, such as lonlat, which you may want to set if the coordinates are not longitude/latitude. Note that the n argument is fixed at one.

Value

A vector giving the mean cross-validation statistics and the mean standard deviations for these across the folds (see Description for details).

References

Hijmans, R.J., 2012. Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null-model. Ecology 93: 679-688.

See Also

runBRT, PresenceAbsence, Kappa, auc, sensitivity, specificity, pcc, pwdSample

Examples

# TO DO

SEEG-Oxford/seegSDM documentation built on May 10, 2017, 10:25 a.m.