adversarial_debiasing: Adversarial Debiasing
In aif360: Help Detect and Mitigate Bias in Machine Learning Models


View source: R/inprocessing_adversarial_debiasing.R

Adversarial debiasing is an in-processing technique that trains a classifier to maximize prediction accuracy while simultaneously reducing an adversary's ability to determine the protected attribute from the predictions.
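This page does not say how the two objectives are combined during training. As a rough illustration only, the sketch below follows the gradient adjustment described by Zhang, Lemoine and Mitchell (2018): the classifier gradient has its component along the adversary gradient removed, and a weighted adversary gradient is then subtracted. The function name `debiased_gradient` and its arguments are hypothetical and not part of aif360; `alpha` is assumed to play the role of `adversary_loss_weight`.

```r
# Hypothetical helper (not an aif360 function): one classifier gradient
# adjustment in the style of Zhang et al. (2018), using base R only.
#   g_clf: gradient of the classifier loss w.r.t. the classifier weights
#   g_adv: gradient of the adversary loss w.r.t. the same weights
#   alpha: assumed to correspond to adversary_loss_weight
debiased_gradient <- function(g_clf, g_adv, alpha = 0.1) {
  stopifnot(length(g_clf) == length(g_adv))
  # Unit vector along the adversary gradient (eps guards against a zero norm)
  eps <- .Machine$double.eps
  unit_adv <- g_adv / (sqrt(sum(g_adv^2)) + eps)
  # Remove the part of the classifier gradient that points along g_adv,
  # so a descent step does not move in the direction that lowers adversary loss
  projected <- g_clf - sum(g_clf * unit_adv) * unit_adv
  # Subtract the weighted adversary gradient, so a descent step raises adversary loss
  projected - alpha * g_adv
}

# Toy vectors chosen only for illustration
g_clf <- c(1, 1)
g_adv <- c(1, 0)
adjusted <- debiased_gradient(g_clf, g_adv, alpha = 0.1)
print(adjusted)
```

Under this formulation, a larger `alpha` penalizes the classifier more strongly for leaking the protected attribute, usually at some cost in accuracy.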

adversarial_debiasing(
  unprivileged_groups,
  privileged_groups,
  scope_name = "current",
  sess = tf$compat$v1$Session(),
  seed = NULL,
  adversary_loss_weight = 0.1,
  num_epochs = 50,
  batch_size = 128,
  classifier_num_hidden_units = 200,
  debias = TRUE
)

`unprivileged_groups`	a list with two values: the name of the protected attribute column and the value that identifies the unprivileged group.
`privileged_groups`	a list with two values: the name of the protected attribute column and the value that identifies the privileged group.
`scope_name`	scope name for the TensorFlow variables.
`sess`	TensorFlow session.
`seed`	seed to make `predict` repeatable.
`adversary_loss_weight`	hyperparameter that sets the strength of the adversarial loss.
`num_epochs`	number of training epochs.
`batch_size`	batch size.
`classifier_num_hidden_units`	number of hidden units in the classifier model.
`debias`	logical; if TRUE, learn the classifier with debiasing; if FALSE, learn it without debiasing.

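# Load the aif360 library and the Adult dataset
load_aif360_lib()
ad <- adult_dataset()

# Each group is a list: protected attribute column, then the group's value
p <- list("race", 1)  # privileged group
u <- list("race", 0)  # unprivileged group

sess <- tf$compat$v1$Session()

# Baseline classifier trained without debiasing (debias = FALSE)
plain_model <- adversarial_debiasing(privileged_groups = p,
                                     unprivileged_groups = u,
                                     scope_name = "plain_classifier",
                                     debias = FALSE,
                                     sess = sess)

# Train on the dataset, then predict on the same data
plain_model$fit(ad)
ad_nodebiasing <- plain_model$predict(ad)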
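The example above trains only the baseline with `debias = FALSE`. A debiased counterpart could be sketched as follows. The helper `fit_debiased` is hypothetical and not part of aif360. It assumes that `load_aif360_lib()` has already been called so that `tf` is available, and that resetting the default graph before opening a new session is needed to avoid variable clashes with an earlier model; treat both as assumptions to check against your installation.

```r
# Hypothetical helper (not an aif360 function): trains a debiased model in a
# fresh TensorFlow graph and returns the dataset with its predictions.
fit_debiased <- function(dataset, privileged, unprivileged,
                         scope_name = "debiased_classifier",
                         adversary_loss_weight = 0.1) {
  # Assumption: clear variables left behind by any previously built model
  tf$compat$v1$reset_default_graph()
  sess <- tf$compat$v1$Session()
  # Close the session once predictions have been computed
  on.exit(sess$close(), add = TRUE)

  model <- adversarial_debiasing(
    unprivileged_groups = unprivileged,
    privileged_groups = privileged,
    scope_name = scope_name,        # use a scope distinct from the baseline's
    sess = sess,
    adversary_loss_weight = adversary_loss_weight,
    debias = TRUE                   # train with the adversary
  )
  model$fit(dataset)
  model$predict(dataset)
}

# Usage, assuming `ad`, `p`, `u` and `sess` exist as in the example above:
# sess$close()  # close the baseline session first
# ad_debiasing <- fit_debiased(ad, privileged = p, unprivileged = u)
```

Comparing `ad_nodebiasing` with `ad_debiasing` on a fairness metric is the usual next step, ideally on held-out data rather than the training set.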