blockCV: A package for generating spatially or...

Description

Simple random selection of training and testing folds in spatially structured environments leads to underestimation of error when evaluating spatial predictions and may result in inappropriate model selection (Telford and Birks, 2009; Roberts et al., 2017). Using spatial and environmental blocks to separate training and testing sets has been suggested as a good strategy for realistic error estimation in datasets with dependence structures and, more generally, as a robust method for estimating the predictive performance of models used to predict mapped distributions (Roberts et al., 2017).

The blockCV package provides functions to separate training and testing sets using buffers, spatial blocks, or environmental blocks, with several options for how those blocks are constructed. It also includes a function that applies geostatistical techniques to investigate the existing level of spatial autocorrelation in the covariates, which informs the choice of a suitable distance band by which to separate the data sets. In addition, visualization tools help the user choose the block size and explore the generated folds. The package was written with species distribution modelling in mind, and its functions allow for a number of common scenarios, including presence-absence and presence-background species data, rare and common species, and raster data for predictor variables.
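The idea of spatial blocking can be illustrated with a minimal, language-neutral sketch. This is not the package's R API: the function name `assign_block_folds`, its parameters, and the use of square blocks dealt to folds at random are all assumptions made for illustration. Each point is assigned to the square block that contains it, and whole blocks (never individual points) are then distributed across the folds, so nearby points tend to end up in the same fold.

```python
import random


def assign_block_folds(points, block_size, k, seed=0):
    """Return a fold index (0..k-1) for each (x, y) point.

    Hypothetical helper: square blocks of side `block_size` are
    assigned to `k` folds at random, so all points that fall in the
    same block share a fold.
    """
    # Identify the square block that contains each point.
    block_ids = [(int(x // block_size), int(y // block_size)) for x, y in points]

    # Shuffle the distinct occupied blocks, then deal them to folds round-robin.
    blocks = sorted(set(block_ids))
    rng = random.Random(seed)
    rng.shuffle(blocks)
    fold_of_block = {block: i % k for i, block in enumerate(blocks)}

    return [fold_of_block[block] for block in block_ids]


# Example: 200 random points on a 100 x 100 extent, 25-unit blocks, 4 folds.
rng = random.Random(42)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]
folds = assign_block_folds(points, block_size=25.0, k=4)

for fold in range(4):
    test = [p for p, f in zip(points, folds) if f == fold]
    train = [p for p, f in zip(points, folds) if f != fold]
    print(f"fold {fold}: {len(train)} training points, {len(test)} testing points")
```

In this sketch the block size plays the role of the separation distance discussed above: larger blocks keep training and testing points further apart on average, at the cost of fewer blocks to distribute among the folds.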

Author(s)

Roozbeh Valavi, Jane Elith, José Lahoz-Monfort and Gurutzeta Guillera-Arroita

References

Roberts et al., 2017. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 40, 913–929.

Telford, R.J., Birks, H.J.B., 2009. Evaluation of transfer functions in spatially structured environments. Quat. Sci. Rev. 28, 1309–1316.

See Also

spatialBlock, buffering and envBlock for blocking strategies.


adamlilith/blockCV documentation built on May 25, 2019, 12:41 a.m.