thresholdStats: Thresholded evaluation statistics

thresholdStats    R Documentation

Description

This function calculates a series of evaluation statistics based on a threshold or thresholds used to convert continuous predictions to binary predictions.

Usage

thresholdStats(
  thresholds,
  pres,
  contrast,
  presWeight = rep(1, length(pres)),
  contrastWeight = rep(1, length(contrast)),
  delta = 0.001,
  na.rm = FALSE,
  bg = NULL,
  bgWeight = NULL,
  ...
)

Arguments

thresholds

Numeric or numeric vector. Threshold(s) at which to calculate sensitivity and specificity.

pres

Numeric vector. Predicted values at test presences.

contrast

Numeric vector. Predicted values at background/absence sites.

presWeight

Numeric vector same length as pres. Relative weights of presence sites. The default is to assign each presence a weight of 1.

contrastWeight

Numeric vector same length as contrast. Relative weights of background sites. The default is to assign each background site a weight of 1.

delta

Positive numeric in the range (0, 1), usually very small. This value is used only when calculating SEDI and any true positive rate or false negative rate is 0 or the false negative rate is 1. Since SEDI uses log(x) and log(1 - x), rates of exactly 0 or 1 would produce NAs. To avoid this, delta is added to rates that equal 0 and subtracted from rates that equal 1.

na.rm

Logical. If TRUE, remove any presence predictions and background predictions that are NA, together with their associated weights.

bg

Same as contrast. Included for backwards compatibility. Ignored if contrast is not NULL.

bgWeight

Same as contrastWeight. Included for backwards compatibility. Ignored if contrastWeight is not NULL.

...

Other arguments (unused).
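The role of delta can be illustrated with a minimal base-R sketch. It assumes the usual form of SEDI written in terms of the true positive rate and the false positive rate; which rates the package adjusts internally, and how, is an assumption here, and sediSketch is a hypothetical helper rather than a function from the package.

```r
# Sketch only: SEDI from a true positive rate (tpr) and a false positive
# rate (fpr). `sediSketch` is a hypothetical name, not part of the package.
sediSketch <- function(tpr, fpr, delta = 0.001) {
  # move rates of exactly 0 or 1 inward by delta so every log() is finite
  nudge <- function(x) {
    x[x == 0] <- delta
    x[x == 1] <- 1 - delta
    x
  }
  tpr <- nudge(tpr)
  fpr <- nudge(fpr)
  num <- log(fpr) - log(tpr) - log(1 - fpr) + log(1 - tpr)
  den <- log(fpr) + log(tpr) + log(1 - fpr) + log(1 - tpr)
  num / den
}

# without the adjustment, log(0) is -Inf and the ratio is NaN
sediSketch(tpr = 1, fpr = 0)

# delta plays no role when no rate equals 0 or 1
sediSketch(tpr = 0.8, fpr = 0.2)
```

A smaller delta pushes the adjusted value closer to the limit the index would approach as the rates tend to 0 or 1.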

Value

Matrix with the following named columns. In the formulas below, a = summed weight of presences >= threshold, b = summed weight of background sites >= threshold, c = summed weight of presences < threshold, d = summed weight of background sites < threshold, and N = sum of all presence and background weights.

  • 'threshold': Threshold

  • 'sensitivity': Sensitivity (a / (a + c))

  • 'specificity': Specificity (d / (d + b))

  • 'ccr': Correct classification rate ((a + d) / N)

  • 'ppp': Positive predictive power (a / (a + b))

  • 'npp': Negative predictive power (d / (c + d))

  • 'mr': Misclassification rate ((b + c) / N)

  • 'orss': Odds ratio skill score at the threshold (Wunderlich et al. 2019).

  • 'sedi': Symmetrical extremal dependence index at the threshold (Wunderlich et al. 2019).
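To make the formulas above concrete, here is a minimal base-R sketch that computes the weighted quantities a, b, c, and d for a single threshold and derives the ratio statistics from them. It is an illustration under stated assumptions, not the package's implementation: statsSketch is a hypothetical helper, the ORSS line uses the standard form (ad - bc) / (ad + bc), and SEDI is left out.

```r
# Sketch only: weighted confusion-matrix statistics at one threshold.
# `statsSketch` is a hypothetical name, not part of the package.
statsSketch <- function(threshold, pres, contrast,
                        presWeight = rep(1, length(pres)),
                        contrastWeight = rep(1, length(contrast))) {
  tp <- sum(presWeight[pres >= threshold])          # a in the formulas
  fp <- sum(contrastWeight[contrast >= threshold])  # b
  fn <- sum(presWeight[pres < threshold])           # c
  tn <- sum(contrastWeight[contrast < threshold])   # d
  N <- tp + fp + fn + tn
  c(
    threshold = threshold,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    ccr = (tp + tn) / N,
    ppp = tp / (tp + fp),
    npp = tn / (fn + tn),
    mr = (fp + fn) / N,
    orss = (tp * tn - fp * fn) / (tp * tn + fp * fn)  # standard ORSS form
  )
}

pres <- c(0.2, 0.6, 0.7, 0.9)
contrast <- c(0.1, 0.3, 0.4, 0.8)

# equal weights
statsSketch(0.5, pres, contrast)

# down-weighting the three presences above the threshold lowers sensitivity
statsSketch(0.5, pres, contrast, presWeight = c(1, 0.1, 0.1, 0.1))
```

Because ccr and mr share the denominator N and split the same total weight, they always sum to 1, which is a quick sanity check on any output.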

References

Fielding, A.H. and Bell, J.F. 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24:38-49.

Wunderlich, R.F., Lin, P-Y., Anthony, J., and Petway, J.R. 2019. Two alternative evaluation metrics to replace the true skill statistic in the assessment of species distribution models. Nature Conservation 35:97-116.

See Also

threshold, thresholdWeighted, evaluate

Examples

set.seed(123)

# set of bad and good predictions at presences
bad <- runif(100)^2
good <- runif(100)^0.1
hist(good, breaks=seq(0, 1, by=0.1), border='green', main='Presences')
hist(bad, breaks=seq(0, 1, by=0.1), border='red', add=TRUE)
pres <- c(bad, good)
contrast <- runif(1000)
thresholds <- c(0.1, 0.5, 0.9)
thresholdStats(thresholds, pres, contrast)

# upweight bad predictions
presWeight <- c(rep(1, 100), rep(0.1, 100))
thresholdStats(thresholds, pres, contrast, presWeight=presWeight)

# upweight good predictions
presWeight <- c(rep(0.1, 100), rep(1, 100))
thresholdStats(thresholds, pres, contrast, presWeight=presWeight)

adamlilith/enmSdm documentation built on Jan. 6, 2023, 11 a.m.