aucMultiWeighted: Calculate multivariate weighted AUC
In adamlilith/enmSdm: Tools for Modeling Niches and Distributions of Species

Description

This function calculates a multivariate version of the area under the receiver operating characteristic curve (AUC). The multivariate version is simply the mean of the AUCs across all possible pairwise comparisons between cases (Hand & Till 2001). For example, suppose predictions can be classified into three groups of expectation, say A, B, and C, where we expect predictions in group A to be greater than those in B and C, and predictions in group B to be greater than those in C. The multivariate AUC for this situation is mean(wAB * auc_mean(A, B), wAC * auc_mean(A, C), wBC * auc_mean(B, C)), where auc_mean(X, Y) is the AUC calculated between cases X and Y, and wXY is a weight for that case-comparison.
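The unweighted form of this calculation can be sketched in base R. This is a minimal illustration, not the package's implementation: `pairAuc` and `multiAucSketch` are hypothetical helpers, site-level weights are ignored, and the sketch assumes groups are passed from lowest to highest expected prediction, with each pairwise AUC taken as the proportion of (higher, lower) pairs in which the higher-group value is larger.

```r
# Hypothetical helper: AUC as the probability that a value drawn from
# `higher` exceeds one drawn from `lower` (ties count as one half).
pairAuc <- function(lower, higher) {
  cmp <- outer(higher, lower, FUN = function(h, l) (h > l) + 0.5 * (h == l))
  mean(cmp)
}

# Hypothetical helper: unweighted mean of all pairwise AUCs for groups
# listed from lowest to highest expected prediction.
multiAucSketch <- function(...) {
  groups <- list(...)
  pairs <- combn(length(groups), 2)  # each column is one pair (i, j), i < j
  aucs <- apply(pairs, 2, function(ij) {
    pairAuc(groups[[ij[1]]], groups[[ij[2]]])
  })
  # naming pattern assumed from the Value section of this page
  names(aucs) <- paste0("case", pairs[2, ], "_over_case", pairs[1, ])
  c(aucs, multivariate = mean(aucs))
}

# Toy data, chosen only for illustration
low <- c(0.1, 0.2, 0.3)
middle <- c(0.25, 0.4, 0.5)
high <- c(0.45, 0.7, 0.9)

res <- multiAucSketch(low, middle, high)
print(res)
```

With three groups there are three pairwise comparisons, so the `multivariate` element is the plain average of three AUCs.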

Usage

aucMultiWeighted(..., weightBySize = FALSE, na.rm = FALSE)

Arguments

`...`	A set of two or more numeric vectors, or two or more 2-column matrices or data frames. The objects must be listed in order of expected probability. For example, you might have a set of predictions for objects you expect to have a low predicted probability (e.g., long-term absences of an animal), a set that you expect to have middle levels of probability (e.g., sites that were recently vacated), and a set for which you expect a high level of predicted probability (e.g., sites that are currently occupied). In this case you should list the cases in order: low, middle, high. If a 2-column matrix or data frame is supplied, then the first column is assumed to represent predictions and the second is assumed to represent site-level weights (see `aucWeighted`). Note that site-level weighting is different from case-level weighting.
`weightBySize`	Logical. If `FALSE` (default), the multivariate measure of AUC treats all comparisons as equal (e.g., low versus middle weighs as much as middle versus high), and so is simply the mean AUC across all possible comparisons. If `TRUE`, the multivariate AUC is the weighted mean across all possible comparisons, where weights are the number of comparisons between each of the two cases. For example, if a set of "low" predictions has 10 data points, "middle" has 10, and "high" has 20, then the multivariate AUC will be (10 * low + 10 * middle + 20 * high) / (10 + 10 + 20).
`na.rm`	Logical. If `TRUE` then remove any cases in `...` that are `NA`.
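To make the effect of `weightBySize` concrete, here is a small base-R sketch of a weighted mean over pairwise AUCs. It reads the phrase "number of comparisons between each of the two cases" literally, so each comparison is weighted by the product of the two group sizes. That reading is an assumption, and the package may compute its weights differently. The AUC values are made up for illustration and are not package output.

```r
# Group sizes taken from the example in the argument description.
n <- c(low = 10, middle = 10, high = 20)

# Illustrative pairwise AUCs (made-up values, not package output).
aucs <- c(
  case2_over_case1 = 0.6,
  case3_over_case1 = 0.9,
  case3_over_case2 = 0.8
)

# Assumed weight for each comparison: the number of point pairs compared,
# i.e., the product of the two group sizes.
pairs <- combn(length(n), 2)  # columns: (1, 2), (1, 3), (2, 3)
w <- n[pairs[1, ]] * n[pairs[2, ]]
names(w) <- names(aucs)

unweighted <- mean(aucs)            # analogous to weightBySize = FALSE
weighted <- weighted.mean(aucs, w)  # analogous to weightBySize = TRUE

print(w)
print(c(unweighted = unweighted, weighted = weighted))
```

Under this assumption, comparisons involving the larger "high" group carry twice the weight of the low-versus-middle comparison, so they pull the multivariate value toward their own AUCs.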

Value

Named numeric vector. The names of the pairwise AUCs take the form `case2_over_case1` (which in this example means the AUC of item #1 in `...` when compared to the second item in `...`), plus `multivariate` (which is the multivariate AUC).

References

Hand, DJ and Till, RJ. 2001. A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45:171-186. doi: 10.1023/A:1010920819831.

Examples

set.seed(123)

# no weights
low <- runif(10)^2
middle <- runif(10)
high <- sqrt(runif(20))

aucMultiWeighted(low, middle, high)

# equal site-level weights (second column of each matrix holds the weights)
low <- matrix(c(low, rep(1, length(low))), ncol=2)
middle <- matrix(c(middle, rep(1, length(middle))), ncol=2)
high <- matrix(c(high, rep(1, length(high))), ncol=2)
aucMultiWeighted(low, middle, high)

# equal weights with weighting by number of comparisons
aucMultiWeighted(low, middle, high, weightBySize=TRUE)

# unequal site-level weights: down-weight "middle" predictions above 0.5
middle[ , 2] <- ifelse(middle[ , 1] > 0.5, 0.1, 1)
aucMultiWeighted(low, middle, high)

# unequal weights with weighting by number of comparisons
aucMultiWeighted(low, middle, high, weightBySize=TRUE)

adamlilith/enmSdm documentation built on Jan. 6, 2023, 11 a.m.