| getBins | R Documentation |
Get continuous predicted values into bins according to specific criteria.
getBins(model = NULL, obs = NULL, pred = NULL, id = NULL,
bin.method, n.bins = ifelse(bin.method != "mov.bins", 10, 100),
fixed.bin.size = FALSE, min.bin.size = 15,
min.prob.interval = 0.1, bin.width = "default", quantile.type = 7,
simplif = FALSE, verbosity = 2, na.rm = TRUE, rm.dup = FALSE)
model |
optional binary-response model object of class "glm", "gam", "gbm", "randomForest" or "bart". If this argument is provided, 'obs' and 'pred' will be extracted with |
obs |
alternatively to 'model' and together with 'pred', a numeric vector of observed presences (1) and absences (0) of a binary response variable. Alternatively (and if 'pred' is a 'SpatRaster'), a two-column matrix or data frame containing, respectively, the x (longitude) and y (latitude) coordinates of the presence points, in which case the 'obs' vector will be extracted with |
pred |
alternatively to 'model' and together with 'obs', a vector with the corresponding predicted values of presence probability, habitat suitability, environmental favourability or alike. Must be of the same length and in the same order as 'obs'. Alternatively (and if 'obs' is a set of point coordinates), a 'SpatRaster' map of the predicted values for the entire evaluation region, in which case the 'pred' vector will be extracted with |
id |
optional vector of row identifiers; must be of the same length and in the same order as |
bin.method |
the method with which to divide the values into bins. Type modEvAmethods("getBins") for available options and see Details for more information on these methods. |
n.bins |
the number of bins in which to divide the data. |
fixed.bin.size |
logical, whether all bins should have (approximately) the same size. |
min.bin.size |
integer value defining the minimum number of observations to include in each bin. The default is 15, the minimum required for accurate comparisons within bins (Jovani & Tella 2006, Jimenez-Valverde et al. 2013). |
bin.width |
width of the moving window (if bin.method = "mov.bins"), in the units of 'pred' (e.g. 0.1). By default, it is 1/10th of the 'pred' range. |
min.prob.interval |
minimum range of predicted values in each bin (default 0.1). |
quantile.type |
argument to pass to |
simplif |
logical (default FALSE), whether to calculate a faster, simplified version (used internally in other functions). |
verbosity |
integer specifying the amount of console messages or warnings to display. Defaults to the maximum implemented; lower numbers (down to 0) decrease the number of messages. |
na.rm |
logical, whether to remove (with a warning saying how many) rows with NA in any of the 'obs' or 'pred' values. The default is TRUE, as some 'bin.method' options will fail if there are NAs. |
rm.dup |
logical (default FALSE). If |
Mind that different bin.methods can lead to visibly different results regarding the bins and any operations that depend on them (such as HLfit and Boyce). Currently available bin.methods are:
- round.prob: predicted values are rounded to the number of digits of min.prob.interval - e.g., if min.prob.interval = 0.1 (the default), values under 0.05 get into bin 1 (rounded probability = 0), values between 0.05 and 0.15 get into bin 2 (rounded probability = 0.1), etc. until values with probability over 0.95, which get into bin 11. Arguments n.bins, fixed.bin.size and min.bin.size are ignored by this bin.method.
- prob.bins: predicted values are grouped into bins of the given intervals - e.g., if min.prob.interval = 0.1 (the default), bin 1 gets the values between 0 and 0.1, bin 2 gets the values between 0.1 and 0.2, etc. until bin 10, which gets the values between 0.9 and 1. Arguments n.bins, fixed.bin.size and min.bin.size are ignored by this bin.method.
- size.bins: predicted values are grouped into bins of (approximately) equal size, defined by argument min.bin.size. Arguments n.bins and min.prob.interval are ignored by this bin.method.
- n.bins: predicted values are divided into the number of bins given by argument n.bins, and their sizes may or may not be forced to be (approximately) equal, depending on argument fixed.bin.size (which is FALSE by default). Arguments min.bin.size and min.prob.interval are ignored by this bin.method.
- quantiles: predicted values are divided using R function quantile, with cutpoints defined by the given n.bins (i.e., deciles by default), and with the quantile algorithm defined by argument quantile.type. Arguments fixed.bin.size, min.bin.size and min.prob.interval are ignored by this bin.method.
- mov.bins: predicted values are grouped into moving bins (i.e., bins defined by a moving window), which partially overlap, with cutpoints defined by bin.width. Arguments fixed.bin.size, min.bin.size and min.prob.interval are ignored by this bin.method.
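The differences between some of these methods can be sketched in base R. This is only an illustration under stated assumptions, not the code used by getBins: the 'pred' vector is simulated, and the object names (round_bin, prob_bin, quant_bin) are hypothetical.

```r
# Illustrative base-R sketch; NOT the getBins implementation.
set.seed(1)
pred <- runif(200)  # hypothetical predicted probabilities

# "round.prob"-like: round to the number of digits of
# min.prob.interval (0.1 has one decimal digit)
round_bin <- round(pred, digits = 1)

# "prob.bins"-like: fixed-width intervals of 0.1 between 0 and 1
prob_bin <- cut(pred, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

# "quantiles"-like: cutpoints from quantile() for 10 bins (deciles),
# using quantile algorithm type 7
q_breaks <- quantile(pred, probs = seq(0, 1, length.out = 11), type = 7)
quant_bin <- cut(pred, breaks = q_breaks, include.lowest = TRUE)

# compare how many values fall in each bin under each approach
table(round_bin)
table(prob_bin)
table(quant_bin)
```

Rounding can produce up to 11 groups (0, 0.1, ..., 1), whereas fixed intervals of 0.1 produce at most 10, which matches the bin counts described for round.prob and prob.bins above.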
The output of getBins is a list with the following components:
prob.bin |
the bin in which each observation falls. |
bins.table |
a data frame with the sample size, number of presences, number of absences, prevalence, mean and median predicted value, and the difference between predicted and observed values (mean predicted value minus observed prevalence) in each bin. |
N |
the total number of observations in the analysis. |
n.bins |
the total number of bins obtained. |
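To make the structure of this output more concrete, the following base-R sketch builds a comparable list from simulated data. It is an assumption-laden illustration rather than the function's own code: the data are simulated, the bins are simple 0.1-wide intervals, and the column names of bins_table are hypothetical (the actual column names returned by getBins may differ).

```r
# Illustrative base-R sketch; NOT the getBins implementation.
set.seed(42)
pred <- runif(300)             # hypothetical predicted values
obs  <- rbinom(300, 1, pred)   # hypothetical presences (1) and absences (0)

# assign each observation to a 0.1-wide bin
prob.bin <- cut(pred, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
bin <- droplevels(prob.bin)    # ignore bins that received no observations

n  <- tapply(obs, bin, length) # sample size per bin
n1 <- tapply(obs, bin, sum)    # presences per bin

# column names below are hypothetical
bins_table <- data.frame(
  bin         = levels(bin),
  N           = as.vector(n),
  presences   = as.vector(n1),
  absences    = as.vector(n - n1),
  prevalence  = as.vector(n1 / n),
  mean.pred   = as.vector(tapply(pred, bin, mean)),
  median.pred = as.vector(tapply(pred, bin, median))
)
# mean predicted value minus observed prevalence
bins_table$difference <- bins_table$mean.pred - bins_table$prevalence

out <- list(prob.bin = prob.bin, bins.table = bins_table,
            N = length(obs), n.bins = nrow(bins_table))
str(out, max.level = 1)
```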
If bin.method = "mov.bins", currently the output is a list with only the 'bins.table', which includes two additional columns at the beginning, specifying the first (minimum) and last (maximum) predicted value of each bin.
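A moving-window table of this kind can be sketched in base R as follows. This is a rough illustration only, not the getBins code: the data are simulated, the windows are assumed to be evenly spaced across the 'pred' range, the column names are hypothetical, and the first two columns here hold the window limits rather than the observed minimum and maximum within each bin.

```r
# Illustrative base-R sketch; NOT the getBins implementation.
set.seed(7)
pred <- runif(150)             # hypothetical predicted values
obs  <- rbinom(150, 1, pred)   # hypothetical presences (1) and absences (0)

bin.width <- diff(range(pred)) / 10  # 1/10th of the 'pred' range
n.bins <- 100                        # default n.bins for "mov.bins"

# assumption: window starts are evenly spaced, so windows partially overlap
starts <- seq(min(pred), max(pred) - bin.width, length.out = n.bins)

mov_table <- do.call(rbind, lapply(starts, function(s) {
  inside <- pred >= s & pred <= s + bin.width
  data.frame(
    min.pred   = s,                  # lower limit of the window
    max.pred   = s + bin.width,      # upper limit of the window
    N          = sum(inside),        # observations inside the window
    prevalence = mean(obs[inside]),  # NaN if the window is empty
    mean.pred  = mean(pred[inside])
  )
}))
head(mov_table)
```

Because the step between consecutive window starts is smaller than bin.width, the same observation can contribute to several bins, which is why no single 'prob.bin' assignment per observation is returned in this case.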
This function is still under development and may fail for some datasets and binning methods (e.g., ties may sometimes preclude binning under some bin.methods). Fixes and further binning methods are in preparation. Feedback is welcome.
A. Marcia Barbosa
Jimenez-Valverde A., Acevedo P., Barbosa A.M., Lobo J.M. & Real R. (2013) Discrimination capacity in species distribution models depends on the representativeness of the environmental domain. Global Ecology and Biogeography 22: 508-516
Jovani R. & Tella J.L. (2006) Parasite prevalence and sample size: misconceptions and solutions. Trends in Parasitology 22: 214-218
HLfit
# load the package and its sample models:
library(modEvA)
data(rotif.mods)
# choose a particular model to play with:
mod <- rotif.mods$models[[1]]
# try getBins using different binning methods:
getBins(model = mod, bin.method = "quantiles")
getBins(model = mod, bin.method = "n.bins")
getBins(model = mod, bin.method = "n.bins", fixed.bin.size = TRUE)