kfold_occurrence_background: Create k folds of occurrence and background data for...
In lifewatch/marinespeed: Benchmark Data Sets and Functions for Marine Species Distribution Modelling


kfold_occurrence_background creates a k-fold partitioning of occurrence and background data for cross-validation, using random or stratified folds. It returns a list with the occurrence folds and the background folds. Folds are represented as TRUE/FALSE/NA columns of a dataframe, one column per fold.


kfold_occurrence_background(occurrence_data, background_data,
  occurrence_fold_type = "disc", k = 5, pwd_sample = TRUE, lonlat = TRUE,
  background_buffer = 200*1000)

`occurrence_data`	Dataframe. Occurrence points of the species, the first column should be the scientific name of the species followed by two columns representing the longitude and latitude (or x,y coordinates if `lonlat = FALSE`).
`background_data`	Dataframe. Background data points, the first column is a dummy column followed by two columns representing the longitude and latitude (or x,y coordinates if `lonlat = FALSE`).
`occurrence_fold_type`	Character vector. How occurrence folds should be generated, currently `"disc"` (see `kfold_disc`), `"grid"` (see `kfold_grid`) and `"random"` are supported.
`k`	Integer. The number of folds (partitions) that have to be created. By default 5 folds are created.
`pwd_sample`	Logical. Whether background points should be picked by pair-wise distance sampling (see `pwdSample`). Installing the FNN package is recommended if you want to do pair-wise distance sampling.
`lonlat`	Logical. If `TRUE` (default) then Great Circle distances are calculated else if `FALSE` Euclidean (planar) distances are calculated.
`background_buffer`	Positive numeric. Distance in meters around the species test points within which training background data is excluded. Use `NA` or a negative number to disable background point filtering.
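The interaction between `lonlat` and `background_buffer` can be illustrated with a small base R sketch. This is not the package's implementation: the haversine formula on a spherical Earth of radius 6371 km stands in for whatever Great Circle routine the package uses, and `haversine_m`, `euclidean` and `outside_buffer` are hypothetical helper names introduced only for this illustration.

```r
# Great-circle (haversine) distance in meters between one point and many points.
# Assumption: spherical Earth with a radius of 6371 km.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Planar distance, the analogue of lonlat = FALSE.
euclidean <- function(x1, y1, x2, y2) sqrt((x2 - x1)^2 + (y2 - y1)^2)

# Hypothetical helper: TRUE for background points that lie farther than
# `buffer` meters from every test point. NA or a negative buffer keeps all.
outside_buffer <- function(background, test_points, buffer = 200 * 1000) {
  if (is.na(buffer) || buffer < 0) return(rep(TRUE, nrow(background)))
  vapply(seq_len(nrow(background)), function(i) {
    d <- haversine_m(background$longitude[i], background$latitude[i],
                     test_points$longitude, test_points$latitude)
    all(d > buffer)
  }, logical(1))
}

test_points <- data.frame(longitude = 0, latitude = 0)
background <- data.frame(longitude = c(0.5, 10), latitude = c(0, 0))

# The first point is about 56 km from the test point, the second about 1112 km.
keep <- outside_buffer(background, test_points)
keep_all <- outside_buffer(background, test_points, buffer = NA)
```

With the default buffer of 200 km only the second background point is kept, while `buffer = NA` keeps both.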

Note that which and how many background points get selected in each fold depends on the occurrence_fold_type, on whether pwd_sample is TRUE or FALSE and on the background_buffer. In some cases this can even lead to no background data being selected. Background points that are selected for neither the training fold nor the test fold are set to NA in the background folds. Random assignment of background points to the folds can be achieved by setting pwd_sample to FALSE and background_buffer to 0 (or to NA, which disables the filtering). Note also that when pwd_sample is TRUE, the same background point might be assigned to different folds.
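As a rough illustration of the fold representation described above, the following base R sketch builds random folds as logical columns. It is not the package's code. The column names `k1` to `k5` and the convention that TRUE marks training rows and FALSE marks test rows are assumptions made for this example.

```r
set.seed(1)
k <- 5
n_occ <- 20

# Assign every occurrence record to exactly one of k folds at random.
fold_id <- sample(rep(seq_len(k), length.out = n_occ))

# One logical column per fold. Assumption: TRUE marks rows used for training
# in that fold, FALSE marks rows held out for testing.
occurrence_folds <- data.frame(species = rep("Abalistes stellatus", n_occ))
for (fold in seq_len(k)) {
  occurrence_folds[[paste0("k", fold)]] <- fold_id != fold
}

# A background point that is neither training nor test data in a fold
# gets NA in that fold's column.
background_folds <- data.frame(species = rep("background", 3),
                               k1 = c(TRUE, FALSE, NA))

# Number of test rows per fold.
test_counts <- colSums(!as.matrix(occurrence_folds[, -1]))
```

Because each record is held out in exactly one fold, every row of `occurrence_folds` contains a single FALSE, and `test_counts` shows four test rows per fold for 20 records and 5 folds.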

A list with 2 dataframes, occurrence and background. Each has the scientific name or "background" as its first column, followed by k columns containing TRUE, FALSE or NA.
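A minimal sketch of how such a list might be used to subset data for one fold is shown below. The `folds` object is a hand-written stand-in rather than real function output, and the column names `k1` and `k2` as well as the meaning of TRUE (training) and FALSE (test) are assumptions that should be checked against the actual return value.

```r
occurrence_data <- data.frame(species = rep("Abalistes stellatus", 4),
                              longitude = c(10, 20, 30, 40),
                              latitude = c(-5, 0, 5, 10))

# Hand-written stand-in for the returned list, here with only 2 folds.
folds <- list(
  occurrence = data.frame(species = rep("Abalistes stellatus", 4),
                          k1 = c(TRUE, TRUE, FALSE, FALSE),
                          k2 = c(FALSE, FALSE, TRUE, TRUE)),
  background = data.frame(species = rep("background", 3),
                          k1 = c(TRUE, NA, FALSE),
                          k2 = c(NA, FALSE, TRUE))
)

# which() drops NA, so rows that are unused in a fold never end up in a subset.
train_k1 <- occurrence_data[which(folds$occurrence$k1), ]
test_k1 <- occurrence_data[which(!folds$occurrence$k1), ]

# Background rows that are neither training nor test data in fold 1.
unused_bg_k1 <- which(is.na(folds$background$k1))
```

Indexing with `which()` rather than with the logical column directly avoids the rows of NA values that R would otherwise produce for NA entries.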

Hijmans, R. J. (2012). Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null model. Ecology, 93(3), 679-688. doi:10.1890/11-0826.1

Radosavljevic, A., & Anderson, R. P. (2013). Making better Maxent models of species distributions: complexity, overfitting and evaluation. Journal of Biogeography. doi:10.1111/jbi.12227

lapply_kfold_species, kfold_disc, kfold_grid, geographic_filter, pwdSample, kfold

library(marinespeed)
set.seed(42)
occurrence_data <- data.frame(species = rep("Abalistes stellatus", 50),
                              longitude = runif(50, -180, 180),
                              latitude = runif(50, -90, 90))

# REMARK: this is NOT how you would want to create random background points.
# Use dedicated functions for this, such as dismo::randomPoints, especially
# for lonlat data.
background_data <- data.frame(species = rep("background", 500),
                              longitude = runif(500, -180, 180),
                              latitude = runif(500, -90, 90))
disc_folds <- kfold_occurrence_background(occurrence_data, background_data,
                                          "disc")
random_folds <- kfold_occurrence_background(occurrence_data, background_data,
                                            "random", pwd_sample = FALSE,
                                            background_buffer = NA)

lifewatch/marinespeed documentation built on Dec. 19, 2019, 2:59 a.m.
