dsldHunting: Confounder and Proxy Hunting

dsldCHunting and dsldOHunting (R Documentation)

Confounder and Proxy Hunting

Description

Confounder hunting: searches for variables C that predict both Y and S. Proxy hunting: searches for variables O that predict S.

Usage

dsldCHunting(data,yName,sName,intersectDepth=10)
dsldOHunting(data,yName,sName)

Arguments

data

Data frame.

yName

Name of the response variable column.

sName

Name of the sensitive attribute column.

intersectDepth

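Maximum number of top predictors taken from each of the Y predictor set and the S predictor set before intersecting them; this also bounds the size of the intersection. Default is 10.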

Details

dsldCHunting: The random forest function qeML::qeRF is run on the given data to obtain feature importances for predicting Y (without S) and for predicting S (without Y). Call the resulting top-ranked features the "important predictors" of Y and S.

Then, for each i from 1 to intersectDepth, the intersection of the top i important predictors of Y and the top i important predictors of S is reported, suggesting possible confounders. Larger values of i report more potential confounders, but progressively weaker ones.
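The intersection step described above can be sketched in base R. This is only an illustration under stated assumptions: the importance scores below are made-up values, not output of qeML::qeRF, and the names impY, impS, rankY, rankS and candidates are hypothetical rather than part of the package.

```r
# Hypothetical importance scores (illustrative values only, NOT qeRF output)
impY <- c(lsat = 0.40, decile3 = 0.25, ugpa = 0.15, fam_inc = 0.10, age = 0.05)
impS <- c(decile3 = 0.30, fam_inc = 0.28, lsat = 0.20, age = 0.12, ugpa = 0.04)

# Rank predictor names from most to least important
rankY <- names(sort(impY, decreasing = TRUE))
rankS <- names(sort(impS, decreasing = TRUE))

# For each depth i, intersect the top i predictors of Y with the top i of S
intersectDepth <- 3
candidates <- lapply(seq_len(intersectDepth), function(i) {
  intersect(head(rankY, i), head(rankS, i))
})
names(candidates) <- paste0("depth", seq_len(intersectDepth))

candidates
```

With these made-up scores, depth 1 yields no common variable, while deeper values of i admit more (and weaker) candidates, which mirrors the trade-off noted above.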

The analyst may then consider omitting the variables C from models of the effect of S on Y.

Note: Run times may be long.

dsldOHunting: Factors, if any, are converted to dummy variables, and then the Kendall tau correlations are calculated between S and the potential proxy variables O, i.e. every column other than Y and S. (The Y column itself does not enter into the computation.)
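As a rough illustration of the idea, the following base R sketch computes Kendall tau correlations between dummy-coded levels of S and two numeric candidate proxies. It is a stand-in under stated assumptions, not the package's internal code: the data frame toy and its columns s, o1 and o2 are hypothetical, and only S is dummy-coded here, whereas the function also converts any factor columns among O.

```r
# Toy data (hypothetical; not one of the package datasets)
toy <- data.frame(
  s  = factor(c("a", "a", "b", "b", "c", "c")),  # sensitive attribute S
  o1 = c(1, 2, 3, 4, 5, 6),                      # candidate proxy
  o2 = c(6, 5, 4, 3, 2, 1)                       # candidate proxy
)

# One 0/1 indicator column per level of S (no intercept column)
sDummies <- model.matrix(~ s - 1, data = toy)

# Kendall tau between each S indicator (rows) and each candidate proxy (columns)
tauMat <- cor(sDummies, toy[, c("o1", "o2")], method = "kendall")

round(tauMat, 3)
```

The result has one row per level of S and one column per candidate proxy; entries that are large in absolute value flag variables that may act as proxies for that level.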

In fairness analyses, in which one desires to either eliminate or reduce the impact of S, one must consider the indirect effect of S via O. One may wish to eliminate or reduce the role of O.

Value

The function dsldCHunting returns an R list, one component for each confounder set found.

The function dsldOHunting returns an R matrix of correlations, one row for each level of S.

Author(s)

N. Matloff

Examples

  

data(lsa) 
dsldCHunting(lsa,'bar','race1')
# e.g. suggests confounders 'decile3', 'lsat'
    
data(mortgageSE)
dsldOHunting(mortgageSE,'deny','black')
# e.g. suggests using loan value and condo purchase as proxies


dsld documentation built on Sept. 14, 2024, 1:08 a.m.