compute_abroca: Compute the value of the abroca statistic.

View source: R/compute_abroca.R

Description

Compute the value of the abroca statistic.

Usage

compute_abroca(df, pred_col, label_col, protected_attr_col,
  majority_protected_attr_val, n_grid = 10000, plot_slices = TRUE,
  image_dir = NULL, identifier = NULL)

Arguments

df

dataframe containing colnames matching pred_col, label_col, and protected_attr_col

pred_col

name of column containing predicted probabilities (string)

label_col

name of column containing true labels (should be 0,1 only) (string)

protected_attr_col

name of column containing protected attribute (string)

majority_protected_attr_val

name of 'majority' group with respect to protected attribute (string)

n_grid

number of grid points to use in approximation (numeric) (default of 10000 is more than adequate for most cases)

plot_slices

if TRUE, ROC slice plots are generated and saved to image_dir (boolean)

image_dir

directory to save images to (string)

identifier

identifier name, used for filenames if plot_slices is set to TRUE (string)

Value

Value of the slice statistic: the absolute value of the area between the ROC curves for protected_attr_col.

References

Josh Gardner, Christopher Brooks, and Ryan Baker. (2019). Evaluating the Fairness of Predictive Student Models Through Slicing Analysis. *Proceedings of the 9th International Conference on Learning Analytics and Knowledge (LAK19)*.
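
For orientation, the statistic described in Gardner et al. (2019) can be written as the integral below. The symbols are notation chosen here, not names used by the package: ROC_b is the ROC curve of the baseline (majority) group, ROC_c is the ROC curve of the comparison group, and t is the false positive rate.

```latex
\[
\mathrm{ABROCA} = \int_{0}^{1} \left| \mathrm{ROC}_{b}(t) - \mathrm{ROC}_{c}(t) \right| \, dt
\]
```

A value of 0 means the two ROC curves coincide; larger values indicate that the model ranks one group better than the other over some range of thresholds.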

Examples

# The compute_abroca function uses a dataframe of predictions to generate
# the abroca statistic. This is the main utility of the abroca package.

# First, we load data, train a model, and generate predictions to evaluate.
data("recidivism")
recidivism$returned = as.factor(recidivism$Return.Status != "Not Returned")
in_train = caret::createDataPartition(recidivism$returned, 
    p = 0.75, list = FALSE)
traindata = recidivism[in_train,c("Release.Year", "County.of.Indictment", 
    "Gender", "Age.at.Release", "returned")]
testdata = recidivism[-in_train,c("Release.Year", "County.of.Indictment", 
    "Gender", "Age.at.Release", "returned")]
lr = glm(returned ~ ., data=traindata, family="binomial")
testdata$pred = predict(lr, testdata, type = "response")

# The predictions are used as the primary input to compute_abroca():
abroca <- compute_abroca(testdata, pred_col = "pred", label_col = "returned", 
    protected_attr_col = "Gender", majority_protected_attr_val = "MALE", 
    plot_slices = FALSE, identifier="recidivism") 
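
To make the role of n_grid concrete, here is a minimal base-R sketch of a grid-based approximation of the area between two ROC curves. It is not the package's implementation. The helpers roc_points and abroca_sketch are hypothetical names. The sketch assumes exactly two groups, 0/1 labels with both classes present in each group, and step-function ROC curves with no special handling of tied scores.

```r
# Hypothetical helper: empirical ROC points from scores and 0/1 labels.
# Assumes both classes are present; tied scores are not handled specially.
roc_points <- function(scores, labels) {
  ord <- order(scores, decreasing = TRUE)
  labels <- labels[ord]
  list(
    fpr = c(0, cumsum(labels == 0) / sum(labels == 0)),
    tpr = c(0, cumsum(labels == 1) / sum(labels == 1))
  )
}

# Hypothetical sketch: evaluate both ROC curves on a common grid of false
# positive rates, then integrate the absolute gap with the trapezoidal rule.
abroca_sketch <- function(scores, labels, group, majority, n_grid = 10000) {
  grid <- seq(0, 1, length.out = n_grid)
  tpr_on_grid <- function(idx) {
    r <- roc_points(scores[idx], labels[idx])
    # Step-function lookup: highest TPR reached at or below each grid FPR.
    r$tpr[findInterval(grid, r$fpr)]
  }
  gap <- abs(tpr_on_grid(group == majority) - tpr_on_grid(group != majority))
  sum((gap[-1] + gap[-n_grid]) / 2 * diff(grid))
}

# Toy data: group "A" is ranked perfectly, group "B" is ranked in reverse,
# so the two ROC curves are as far apart as possible.
toy_scores <- c(0.9, 0.8, 0.2, 0.1, 0.9, 0.8, 0.2, 0.1)
toy_labels <- c(1, 1, 0, 0, 0, 0, 1, 1)
toy_group  <- c(rep("A", 4), rep("B", 4))
toy_abroca <- abroca_sketch(toy_scores, toy_labels, toy_group, majority = "A")
```

A finer grid only changes the result where the two curves cross or jump between grid points, which is why a large default such as 10000 is rarely worth increasing.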

jpgard/abroca documentation built on May 25, 2019, 11:31 p.m.