IAUC: Influence Functions On AUC


IAUC R Documentation

Influence Functions On AUC

Description

Provides two sample versions (DEIF and SIF) of the influence function for the AUC.

Usage

IAUC(
  score,
  binary,
  threshold = 0.5,
  hypothesis = FALSE,
  testdiff = 0.5,
  alpha = 0.05,
  name = NULL
)

Arguments

score

A vector containing the predictions (continuous scores) assigned by classifiers; Must be numeric.

binary

A vector containing the true class labels (1: positive, 0: negative); Must have the same length as 'score'.

threshold

A numeric value determining the threshold to distinguish influential observations from normal ones; Must lie between 0 and 1; Defaults to 0.5.

hypothesis

A logical value controlling whether SIF is evaluated under its asymptotic distribution; Defaults to FALSE.

testdiff

A numeric value determining the difference in the hypothesis testing; Must lie between 0 and 1; Defaults to 0.5.

alpha

A numeric value determining the significance level in the hypothesis testing; Must lie between 0 and 1; Defaults to 0.05.

name

A vector of names (labels) for the observations; Must have the same length as 'score'; Defaults to NULL.

Details

Applies two sample versions of the influence function to the AUC:

  • deleted empirical influence function (DEIF)

  • sample influence function (SIF)

The influence function is a deletion diagnostic: it measures how an estimate changes when an observation is removed. Such techniques may suffer from the masking effect when several influential observations are present. To investigate potential influential cases in binary classification thoroughly, we suggest that end users also apply ICLC and LAUC. For a complete discussion of these functions, please see the reference.
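The idea behind deletion diagnostics can be illustrated with a minimal base R sketch. It is not the package's implementation of DEIF or SIF: it only computes the raw leave-one-out change in the empirical AUC, without the scaling used in the reference. The helper names `auc_hat` and `loo_auc_change` and the toy data are hypothetical and introduced here for illustration.

```r
# Empirical AUC as the Mann-Whitney statistic (ties count as 1/2).
# 'auc_hat' is a hypothetical helper, not part of the package.
auc_hat <- function(score, binary) {
  pos <- score[binary == 1]
  neg <- score[binary == 0]
  mean(outer(pos, neg, function(p, q) (p > q) + 0.5 * (p == q)))
}

# Raw change in AUC when each observation is deleted in turn.
# This is an unscaled illustration of a deletion diagnostic only.
loo_auc_change <- function(score, binary) {
  full <- auc_hat(score, binary)
  vapply(
    seq_along(score),
    function(i) auc_hat(score[-i], binary[-i]) - full,
    numeric(1)
  )
}

# Toy data: the third observation is a positive case with a low score.
score  <- c(0.9, 0.8, 0.2, 0.7, 0.3, 0.1)
binary <- c(1, 1, 1, 0, 0, 0)

change <- loo_auc_change(score, binary)
most_influential <- which.max(abs(change))
```

In this toy data the poorly scored positive case produces the largest absolute change, which is the kind of observation a deletion diagnostic is meant to surface.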

Value

A list of objects including:

  • 'output': a list of results with 'AUC' (numeric), 'SIF' (a list of dataframes) and 'DEIF' (a list of dataframes)

  • 'rdata': a dataframe of essential results for visualization

  • 'threshold': the numeric value used to distinguish influential observations from normal ones

  • 'test_output': a list of dataframes with the hypothesis testing results

  • 'test_data': a dataframe of essential hypothesis testing results for visualization

  • 'testdiff': the numeric value used as the difference in the hypothesis testing

  • 'alpha': the numeric value used as the significance level

Author(s)

Bo-Shiang Ke and Yuan-chin Ivan Chang

References

Ke, B. S., Chiang, A. J., & Chang, Y. C. I. (2018). Influence Analysis for the Area Under the Receiver Operating Characteristic Curve. Journal of Biopharmaceutical Statistics, 28(4), 722-734.

See Also

ICLC, LAUC

Examples

library(InfluenceAUC)
library(ROCR)
data("ROCR.simple")
# Print IAUC results directly; 'hypothesis' takes a logical value
IAUC(ROCR.simple$predictions, ROCR.simple$labels, hypothesis = TRUE)

data(mtcars)
glmfit <- glm(vs ~ wt + disp, family = binomial, data = mtcars)
prob <- as.vector(predict(glmfit, newdata = mtcars, type = "response"))
output <- IAUC(prob, mtcars$vs, threshold = 0.3, testdiff = 0.3,
               hypothesis = TRUE, name = rownames(mtcars))
# Show results
print(output)
# Visualize results
plot(output)

BoShiangKe/InfluenceAUC documentation built on Nov. 4, 2024, 2:48 a.m.