resample: Resample

In fairmodels: Flexible Tool for Bias Detection, Visualization, and Mitigation

Description

Bias mitigation method. Similarly to `reweight`, this method computes the number of observations that each subgroup would have in each class (+ or -) if the protected variable were independent of y, and on that basis decides whether the subgroup with a given class should be more or less numerous. It then performs oversampling or undersampling, depending on the case. If the type of sampling is set to 'preferential' and `probs` are provided, preferential sampling is performed instead of uniform sampling. Depending on the case, preferential sampling picks observations close to the border or far from the border.
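The "desired number of observations" in the description above can be illustrated with a small base R sketch. It is not the package's implementation: the toy vectors, the variable names, and the rounding step are assumptions made for illustration only. It relies on the standard independence formula, where the expected size of a (subgroup, class) cell is the subgroup size times the class size divided by the total number of observations.

```r
# Toy data (hypothetical): a protected attribute and a binary outcome.
protected <- factor(c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b"))
y <- c(1, 1, 1, 1, 0, 0, 1, 0, 0, 0)

# Observed number of observations in each (subgroup, class) cell.
observed <- unclass(table(protected, y))

# Expected counts if `protected` and `y` were independent:
# n_subgroup * n_class / n_total for every cell.
expected <- outer(rowSums(observed), colSums(observed)) / length(y)

# Positive values: the cell would need more observations (oversampling).
# Negative values: the cell would need fewer observations (undersampling).
# Rounding to whole observations is an assumption of this sketch.
difference <- round(expected) - observed
```

In this toy data, subgroup "a" has more favorable outcomes than independence would predict, so its favorable cell would shrink while its unfavorable cell would grow, and the reverse holds for subgroup "b".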

Usage

```
resample(protected, y, type = "uniform", probs = NULL, cutoff = 0.5)
```

Arguments

`protected`: factor, protected variables with subgroups as levels (sensitive attributes)

`y`: numeric, vector with classes 0 and 1, where 1 means favorable class

`type`: character, either (default) 'uniform' or 'preferential'

`probs`: numeric, vector with probabilities for preferential sampling

`cutoff`: numeric, threshold for probabilities
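To make the relationship between `probs` and `cutoff` more concrete, the following base R sketch ranks observations by how far their probability lies from the threshold. The probability values and variable names are hypothetical, and the sketch only shows what "close to" and "far from" the border could mean. Which ordering the function applies in which case is not shown here.

```r
# Hypothetical predicted probabilities for five observations.
probs <- c(0.48, 0.91, 0.55, 0.10, 0.70)
cutoff <- 0.5

# Distance of each observation from the decision border.
distance <- abs(probs - cutoff)

# Indexes ordered from closest to the border to farthest from it.
closest_first <- order(distance)

# Indexes ordered from farthest from the border to closest to it.
farthest_first <- order(distance, decreasing = TRUE)
```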

Value

numeric vector of indexes
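A vector of indexes like this is typically used to subset the rows of the training data. The base R sketch below uses a hand-written, hypothetical index vector rather than actual `resample` output, and assumes that an index may appear more than once (oversampling) or not at all (undersampling).

```r
# Hypothetical data frame with four observations.
df <- data.frame(id = 1:4, y = c(1, 0, 1, 0))

# Hypothetical index vector: row 1 is duplicated, row 3 is dropped.
idx <- c(1, 1, 2, 4)

# Subsetting by the index vector yields the resampled data set.
resampled <- df[idx, ]
```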

References

This method was implemented based on Kamiran, Calders 2011 https://link.springer.com/content/pdf/10.1007/s10115-011-0463-8.pdf

Examples

```
library(fairmodels)

data("german")
data <- german

# Binarize the protected attribute and encode the target as 0/1
data$Age <- as.factor(ifelse(data$Age <= 25, "young", "old"))
y_numeric <- as.numeric(data$Risk) - 1

# Baseline model trained on the original data
rf <- ranger::ranger(Risk ~ .,
  data = data,
  probability = TRUE,
  num.trees = 50,
  num.threads = 1,
  seed = 123
)

# Uniform resampling: train on the rows selected by resample()
u_indexes <- resample(data$Age, y = y_numeric)

rf_u <- ranger::ranger(Risk ~ .,
  data = data[u_indexes, ],
  probability = TRUE,
  num.trees = 50,
  num.threads = 1,
  seed = 123
)

explainer_rf <- DALEX::explain(rf,
  data = data[, -1],
  y = y_numeric,
  label = "not_sampled"
)

explainer_rf_u <- DALEX::explain(rf_u,
  data = data[, -1],
  y = y_numeric,
  label = "sampled_uniform"
)

fobject <- fairness_check(explainer_rf, explainer_rf_u,
  protected = data$Age,
  privileged = "old"
)

fobject
plot(fobject)

# Preferential resampling: uses the baseline model's predicted probabilities
p_indexes <- resample(data$Age,
  y = y_numeric,
  type = "preferential",
  probs = explainer_rf$y_hat
)

rf_p <- ranger::ranger(Risk ~ .,
  data = data[p_indexes, ],
  probability = TRUE,
  num.trees = 50,
  num.threads = 1,
  seed = 123
)

explainer_rf_p <- DALEX::explain(rf_p,
  data = data[, -1],
  y = y_numeric,
  label = "sampled_preferential"
)

fobject <- fairness_check(explainer_rf, explainer_rf_u, explainer_rf_p,
  protected = data$Age,
  privileged = "old"
)

fobject
plot(fobject)
```

fairmodels documentation built on May 31, 2021, 5:07 p.m.