IAUC | R Documentation |
Provides two sample versions (DEIF and SIF) of the influence function on the AUC.
IAUC(
score,
binary,
threshold = 0.5,
hypothesis = FALSE,
testdiff = 0.5,
alpha = 0.05,
name = NULL
)
score |
A vector containing the predictions (continuous scores) assigned by classifiers; Must be numeric. |
binary |
A vector containing the true class labels (1: positive, 0: negative); Must have the same dimensions as 'score'. |
threshold |
A numeric value determining the threshold to distinguish influential observations from normal ones; Must lie between 0 and 1; Defaults to 0.5. |
hypothesis |
A logical value controlling whether SIF is evaluated under its asymptotic distribution; Defaults to FALSE. |
testdiff |
A numeric value determining the difference in the hypothesis testing; Must lie between 0 and 1; Defaults to 0.5. |
alpha |
A numeric value determining the significance level in the hypothesis testing; Must lie between 0 and 1; Defaults to 0.05. |
name |
A vector comprising the appellations for observations; Must have the same dimensions as 'score'. |
Apply two sample versions of influence functions on AUC:
deleted empirical influence function (DEIF)
sample influence function (SIF)
The concept of the influence function centers on deletion diagnostics; however, such techniques may suffer from masking effects when multiple influential observations are present.
To investigate potential influential cases in binary classification thoroughly, we suggest that end users also apply ICLC and LAUC. For a complete discussion of these functions, please see the reference.
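To make the idea of a deletion diagnostic concrete, the following is a minimal base-R sketch that recomputes the empirical AUC with each observation removed in turn. It is an illustration only and does not reproduce the exact DEIF or SIF formulas from the reference; the helpers `auc_mw` and `loo_auc_change` and the toy data are hypothetical and are not part of this package.

```r
# Empirical AUC via the Mann-Whitney statistic: the proportion of
# (positive, negative) pairs ranked correctly, counting ties as 1/2.
auc_mw <- function(score, binary) {
  pos <- score[binary == 1]
  neg <- score[binary == 0]
  diffs <- outer(pos, neg, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

# Leave-one-out change in AUC for every observation
# (hypothetical helper; a generic deletion diagnostic, not DEIF/SIF itself).
loo_auc_change <- function(score, binary) {
  full <- auc_mw(score, binary)
  vapply(seq_along(score), function(i) {
    full - auc_mw(score[-i], binary[-i])
  }, numeric(1))
}

# Toy data made up for illustration: observation 5 is a negative case
# that received a very high score.
score  <- c(0.9, 0.8, 0.7, 0.6, 0.95, 0.3, 0.2, 0.1)
binary <- c(1,   1,   1,   1,   0,    0,   0,   0)

auc_full <- auc_mw(score, binary)
change   <- loo_auc_change(score, binary)

# Index of the observation whose removal moves the AUC the most
most_influential <- which.max(abs(change))
```

Because each observation is deleted one at a time, two similar outlying observations can hide each other, which is the masking effect mentioned above.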
A list of objects including:
(1) 'output': a list of results with 'AUC' (numeric), 'SIF' (a list of dataframes) and 'DEIF' (a list of dataframes);
(2) 'rdata': a dataframe of essential results for visualization;
(3) 'threshold': the numeric value used to distinguish influential observations from normal ones;
(4) 'test_output': a list of dataframes containing the hypothesis testing results;
(5) 'test_data': a dataframe of essential hypothesis testing results for visualization;
(6) 'testdiff': the numeric value used as the difference in the hypothesis testing;
(7) 'alpha': the numeric value used as the significance level.
Bo-Shiang Ke and Yuan-chin Ivan Chang
Ke, B. S., Chiang, A. J., & Chang, Y. C. I. (2018). Influence Analysis for the Area Under the Receiver Operating Characteristic Curve. Journal of Biopharmaceutical Statistics, 28(4), 722-734.
ICLC, LAUC
library(ROCR)
data("ROCR.simple")
# print out IAUC results directly
IAUC(ROCR.simple$predictions, ROCR.simple$labels, hypothesis = TRUE)
data(mtcars)
glmfit <- glm(vs ~ wt + disp, family = binomial, data = mtcars)
prob <- as.vector(predict(glmfit, newdata = mtcars, type = "response"))
output <- IAUC(prob, mtcars$vs, threshold = 0.3, testdiff = 0.3,
               hypothesis = TRUE, name = rownames(mtcars))
# Show results
print(output)
# Visualize results
plot(output)