thresholdStats: Thresholded evaluation statistics
In adamlilith/enmSdm: Tools for Modeling Niches and Distributions of Species

Description

This function calculates a series of evaluation statistics based on a threshold or thresholds used to convert continuous predictions to binary predictions.

Usage

thresholdStats(
  thresholds,
  pres,
  contrast,
  presWeight = rep(1, length(pres)),
  contrastWeight = rep(1, length(contrast)),
  delta = 0.001,
  na.rm = FALSE,
  bg = NULL,
  bgWeight = NULL,
  ...
)

Arguments

`thresholds`	Numeric or numeric vector. Threshold(s) at which to calculate sensitivity and specificity.
`pres`	Numeric vector. Predicted values at test presences.
`contrast`	Numeric vector. Predicted values at background/absence sites.
`presWeight`	Numeric vector of the same length as `pres`. Relative weights of presence sites. The default is to assign each presence a weight of 1.
`contrastWeight`	Numeric vector of the same length as `contrast`. Relative weights of background sites. The default is to assign each background site a weight of 1.
`delta`	Positive numeric in the range (0, 1), usually very small. This value is used only when calculating SEDI and a rate entering the calculation equals exactly 0 or 1. Since SEDI uses log(x) and log(1 - x), rates of 0 and 1 will produce `NA`s. To obviate this, `delta` is added to rates that equal 0 and subtracted from rates that equal 1.
`na.rm`	Logical. If `TRUE`, remove any presence prediction and its weight, and any background prediction and its weight, when either value in the pair is `NA`.
`bg`	Same as `contrast`. Included for backwards compatibility. Ignored if `contrast` is not `NULL`.
`bgWeight`	Same as `contrastWeight`. Included for backwards compatibility. Ignored if `contrastWeight` is not `NULL`.
`...`	Other arguments (unused).
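
The role of `delta` can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name `sediSketch` and its arguments `hitRate` (sensitivity) and `falseAlarmRate` (1 - specificity) are hypothetical, and the sketch assumes the adjustment is applied to any rate that equals exactly 0 or 1 before logarithms are taken.

```r
# Hypothetical helper (not part of enmSdm): SEDI from a hit rate and a
# false alarm rate, nudging rates of exactly 0 or 1 by `delta`.
sediSketch <- function(hitRate, falseAlarmRate, delta = 0.001) {
  nudge <- function(x) {
    x[x == 0] <- delta       # avoids log(0)
    x[x == 1] <- 1 - delta   # avoids log(1 - 1)
    x
  }
  h <- nudge(hitRate)
  f <- nudge(falseAlarmRate)
  numer <- log(f) - log(h) - log(1 - f) + log(1 - h)
  denom <- log(f) + log(h) + log(1 - f) + log(1 - h)
  numer / denom
}

# A perfect classifier (hit rate 1, false alarm rate 0) would give a
# non-finite result without the adjustment; with it, SEDI is close to 1.
sediPerfect <- sediSketch(1, 0)

# Equal hit and false alarm rates (no skill) give a SEDI of 0.
sediNoSkill <- sediSketch(0.5, 0.5)
```

A smaller `delta` moves the adjusted rates closer to 0 and 1, so the result for a perfect classifier approaches 1 more closely.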

Value

Matrix with the following named columns. In the formulas, a = summed weight of presences >= threshold, b = summed weight of backgrounds >= threshold, c = summed weight of presences < threshold, d = summed weight of backgrounds < threshold, and N = sum of presence and background weights.

'threshold': Threshold
'sensitivity': Sensitivity (a / (a + c))
'specificity': Specificity (d / (d + b))
'ccr': Correct classification rate ((a + d) / N)
'ppp': Positive predictive power (a / (a + b))
'npp': Negative predictive power (d / (c + d))
'mr': Misclassification rate ((b + c) / N)
'orss': Odds ratio skill score (Wunderlich et al. 2019)
'sedi': Symmetrical extremal dependence index (Wunderlich et al. 2019)
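
The formulas above can be checked by hand in base R. The sketch below is not the package's implementation. The helper name `confusionStats` is hypothetical, and it assumes a site counts as predicted present when its prediction is >= the threshold, matching the definitions of a, b, c, and d. ORSS is computed here as (ad - bc) / (ad + bc), and SEDI is left out.

```r
# Hypothetical helper (not part of enmSdm): weighted confusion-matrix
# statistics at a single threshold.
confusionStats <- function(threshold, pres, contrast,
                           presWeight = rep(1, length(pres)),
                           contrastWeight = rep(1, length(contrast))) {
  tp <- sum(presWeight[pres >= threshold])          # a
  fp <- sum(contrastWeight[contrast >= threshold])  # b
  fn <- sum(presWeight[pres < threshold])           # c
  tn <- sum(contrastWeight[contrast < threshold])   # d
  n <- tp + fp + fn + tn                            # N
  c(
    threshold = threshold,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    ccr = (tp + tn) / n,
    ppp = tp / (tp + fp),
    npp = tn / (fn + tn),
    mr = (fp + fn) / n,
    orss = (tp * tn - fp * fn) / (tp * tn + fp * fn)
  )
}

# Toy data: 4 presences and 4 background sites, all with unit weight
toyPres <- c(0.9, 0.8, 0.6, 0.2)
toyContrast <- c(0.7, 0.4, 0.3, 0.1)

# Statistics at one threshold (here a = 3, b = 1, c = 1, d = 3)
stats <- confusionStats(0.5, toyPres, toyContrast)

# One row per threshold
statsByThreshold <- t(sapply(c(0.25, 0.5, 0.75), confusionStats,
                             pres = toyPres, contrast = toyContrast))
```

Because every statistic is a ratio of summed weights, multiplying all presence and background weights by the same constant leaves the results unchanged.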

References

Fielding, A.H. and J.F. Bell. 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24:38-49.

Wunderlich, R.F., Lin, P-Y., Anthony, J., and Petway, J.R. 2019. Two alternative evaluation metrics to replace the true skill statistic in the assessment of species distribution models. Nature Conservation 35:97-116.

Examples

set.seed(123)

# set of bad and good predictions at presences
bad <- runif(100)^2
good <- runif(100)^0.1
hist(good, breaks=seq(0, 1, by=0.1), border='green', main='Presences')
hist(bad, breaks=seq(0, 1, by=0.1), border='red', add=TRUE)
pres <- c(bad, good)
contrast <- runif(1000)
thresholds <- c(0.1, 0.5, 0.9)
thresholdStats(thresholds, pres, contrast)

# upweight bad predictions
presWeight <- c(rep(1, 100), rep(0.1, 100))
thresholdStats(thresholds, pres, contrast, presWeight=presWeight)

# upweight good predictions
presWeight <- c(rep(0.1, 100), rep(1, 100))
thresholdStats(thresholds, pres, contrast, presWeight=presWeight)

adamlilith/enmSdm documentation built on Jan. 6, 2023, 11 a.m.