di_iterate: Iteratively calculate disproportionate impact using multiple...
In DisImpact: Calculates Disproportionate Impact When Binary Success Data are Disaggregated by Subgroups

di_iterate

R Documentation

Iteratively calculate disproportionate impact using multiple methods for many variables.

Description

Iteratively calculate disproportionate impact via the percentage point gap (PPG), proportionality index, and 80% index methods for many success variables, disaggregation variables, and scenarios.

Usage

di_iterate(
  data,
  success_vars,
  group_vars,
  cohort_vars = NULL,
  scenario_repeat_by_vars = NULL,
  exclude_scenario_df = NULL,
  weight_var = NULL,
  include_non_disagg_results = TRUE,
  ppg_reference_groups = "overall",
  min_moe = 0.03,
  use_prop_in_moe = FALSE,
  prop_sub_0 = 0.5,
  prop_sub_1 = 0.5,
  di_prop_index_cutoff = 0.8,
  di_80_index_cutoff = 0.8,
  di_80_index_reference_groups = "hpg",
  check_valid_reference = TRUE,
  parallel = FALSE,
  parallel_n_cores = parallel::detectCores(),
  parallel_split_to_disk = FALSE
)

Arguments

`data`	A data frame for which to iterate DI calculations for a set of variables.
`success_vars`	A character vector of success variable names to iterate across.
`group_vars`	A character vector of group (disaggregation) variable names to iterate across.
`cohort_vars`	(Optional) A character vector of the same length as `success_vars` to indicate the cohort variable to be used for each variable specified in `success_vars`. A vector of length 1 could be specified, in which case the same cohort variable is used for each success variable. If not specified, then a single cohort is assumed for all success variables.
`scenario_repeat_by_vars`	(Optional) A character vector of variables; DI calculations are repeated for every combination of their values. For example, the following variables could be specified: Ed Goal (Degree/Transfer, Short-term Career, Non-credit); First time college student (Yes, No); Full-time status (Yes, No). Each combination of these variables (e.g., full-time, first-time college students with an ed goal of degree/transfer as one combination) would constitute an iteration / sample for which to calculate disproportionate impact for outcomes listed in `success_vars` and for the disaggregation variables listed in `group_vars`. The overall rate of success for full-time, first-time college students with an ed goal of degree/transfer would include just these students and not others. Each variable specified is also collapsed to an '- All' group so that the combinations also reflect all students of a particular category. The total number of combinations for the previous example would be (+1 representing the all category): (3 + 1) x (2 + 1) x (2 + 1) = 36.
`exclude_scenario_df`	(Optional) A data frame with variables that match `scenario_repeat_by_vars` for specifying the combinations to exclude from DI calculations. Following the example specified above, one could choose to exclude part-time non-credit students from consideration.
`weight_var`	(Optional) A character variable specifying the weight variable if the input data set is summarized (i.e., the success variables specified in `success_vars` contain counts of successes). Weight here corresponds to the denominator when calculating the success rate. Defaults to `NULL` for an input data set where each row describes one individual.
`include_non_disagg_results`	A logical variable specifying whether or not the non-disaggregated results should be returned; defaults to `TRUE`. When `TRUE`, a new variable `- None` is added to the data set with a single data value `'- All'`, and this variable is added to `group_vars` as a disaggregation/group variable. The user would want these results returned to review non-disaggregated results.
`ppg_reference_groups`	Either `'overall'`, `'hpg'`, `'all but current'`, or a character vector of the same length as `group_vars` that indicates the reference group value for each group variable in `group_vars` when determining disproportionate impact using the percentage point gap method.
`min_moe`	The minimum margin of error to be used in the PPG calculation, passed to di_ppg.
`use_prop_in_moe`	Whether the estimated proportions should be used in the margin of error calculation by the PPG, passed to di_ppg.
`prop_sub_0`	Passed to di_ppg; defaults to 0.50.
`prop_sub_1`	Passed to di_ppg; defaults to 0.50.
`di_prop_index_cutoff`	Threshold used for determining disproportionate impact using the proportionality index; passed to di_prop_index; defaults to 0.80.
`di_80_index_cutoff`	Threshold used for determining disproportionate impact using the 80% index; passed to di_80_index; defaults to 0.80.
`di_80_index_reference_groups`	Either `'overall'`, `'hpg'`, `'all but current'`, or a character vector of the same length as `group_vars` that indicates the reference group value for each group variable in `group_vars` when determining disproportionate impact using the 80% index.
`check_valid_reference`	Check whether `ppg_reference_groups` and `di_80_index_reference_groups` contain valid values; defaults to `TRUE`.
`parallel`	If `TRUE`, then perform calculations in parallel based on the scenarios specified by `scenario_repeat_by_vars`. Defaults to `FALSE`. Parallel execution is based on the `parallel` package included in base R, using parLapply on Windows and mclapply on POSIX-based systems (Linux/Mac).
`parallel_n_cores`	The number of CPU cores to use if `parallel=TRUE`. Defaults to the maximum number of CPU cores on the system.
`parallel_split_to_disk`	If `TRUE` and `parallel=TRUE`, then create intermediate data sets for each scenario generated by `scenario_repeat_by_vars`, write them to disk, and import the required data set when necessary for each scenario executing in parallel. This feature is useful when the data set specified by `data` is very large and parallel execution is desired for speed in order to reduce the likelihood of consuming all the system's memory and crashing. Note that there is an overhead I/O cost on speed when this feature is used. Defaults to `FALSE`.
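
The scenario count described for `scenario_repeat_by_vars`, and the shape of an `exclude_scenario_df`, can be sketched in base R without calling the package. The column names (`Ed_Goal`, `First_Time`, `Full_Time`) are hypothetical, and the key-matching step only imitates the exclusion idea; it is not the package's internal logic.

```r
# Hypothetical category values, taken from the example in the
# `scenario_repeat_by_vars` description; the column names are assumptions.
ed_goal    <- c("Degree/Transfer", "Short-term Career", "Non-credit")
first_time <- c("Yes", "No")
full_time  <- c("Yes", "No")

# Each variable also gets a collapsed '- All' level.
scenarios <- expand.grid(
  Ed_Goal    = c(ed_goal, "- All"),
  First_Time = c(first_time, "- All"),
  Full_Time  = c(full_time, "- All"),
  stringsAsFactors = FALSE
)
n_scenarios <- nrow(scenarios)  # (3 + 1) * (2 + 1) * (2 + 1)

# A data frame shaped like `exclude_scenario_df`: same column names as the
# scenario variables, one row per combination to skip. Here: part-time
# (Full_Time == "No") non-credit students, for every First_Time level.
exclude_scenario_df <- data.frame(
  Ed_Goal    = "Non-credit",
  First_Time = c("Yes", "No", "- All"),
  Full_Time  = "No",
  stringsAsFactors = FALSE
)

# Imitate the exclusion by matching on a combined key (illustration only).
key  <- function(df) paste(df$Ed_Goal, df$First_Time, df$Full_Time, sep = "|")
kept <- scenarios[!key(scenarios) %in% key(exclude_scenario_df), ]
n_kept <- nrow(kept)
```

Under these assumptions, three of the 36 combinations are dropped, which shows how quickly the number of iterations grows as variables are added to `scenario_repeat_by_vars`.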

Details

Iteratively calculate disproportionate impact via the percentage point gap (PPG), proportionality index, and 80% index methods for all combinations of success_vars, group_vars, and cohort_vars, for each combination of subgroups specified by scenario_repeat_by_vars.
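
To make the three methods concrete, here is a minimal base R sketch on a made-up table of counts. It is a simplified illustration, not the package's code: it assumes a 95% margin of error computed with a proportion of 0.5 and floored at `min_moe = 0.03`, a PPG rule that flags a group when its rate plus the margin of error is at or below the overall rate, and a highest-performing-group reference for the 80% index. The data and object names are hypothetical.

```r
# Made-up counts for three groups (illustration only).
toy <- data.frame(
  group   = c("A", "B", "C"),
  n       = c(100, 50, 50),
  success = c(60, 20, 30)
)
toy$pct <- toy$success / toy$n
overall <- sum(toy$success) / sum(toy$n)  # reference rate for 'overall'

# Percentage point gap: assumed rule is pct + MOE <= reference.
# MOE uses p = 0.5 and z = 1.96 (assumptions), floored at min_moe = 0.03.
moe <- pmax(1.96 * sqrt(0.5 * (1 - 0.5) / toy$n), 0.03)
toy$di_ppg <- as.integer(toy$pct + moe <= overall)

# Proportionality index: share of successes divided by share of the cohort,
# flagged when below the 0.8 cutoff.
prop_index <- (toy$success / sum(toy$success)) / (toy$n / sum(toy$n))
toy$di_prop_index <- as.integer(prop_index < 0.8)

# 80% index: group rate divided by the highest group rate ('hpg'),
# flagged when below the 0.8 cutoff.
index_80 <- toy$pct / max(toy$pct)
toy$di_80_index <- as.integer(index_80 < 0.8)

print(toy)
```

In this toy table all three methods flag group B only. With real data the methods can disagree, which is one reason to compute them side by side.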

Value

A summarized data set (data frame) consisting of:

success_variable (elements of success_vars),
disaggregation (elements of group_vars),
cohort (values corresponding to the variables specified in cohort_vars),
di_indicator_ppg (1 if there is disproportionate impact per the percentage point gap method, 0 otherwise),
di_indicator_prop_index (1 if there is disproportionate impact per the proportionality index, 0 otherwise),
di_indicator_80_index (1 if there is disproportionate impact per the 80% index, 0 otherwise), and
other relevant fields returned from di_ppg, di_prop_index, and di_80_index.
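
A common next step is to keep only the rows flagged by at least one method. The sketch below builds a small stand-in data frame that uses only the column names listed above; the values are invented, and a real result contains additional fields.

```r
# Stand-in for a di_iterate() result; values are invented for illustration.
results <- data.frame(
  success_variable        = c("Transfer", "Transfer", "Transfer"),
  disaggregation          = c("Ethnicity", "Ethnicity", "Gender"),
  cohort                  = c(2017, 2018, 2018),
  di_indicator_ppg        = c(1, 0, 0),
  di_indicator_prop_index = c(1, 0, 1),
  di_indicator_80_index   = c(1, 0, 0)
)

# Count how many of the three methods flag each row.
di_cols <- c("di_indicator_ppg", "di_indicator_prop_index",
             "di_indicator_80_index")
results$n_methods_flagged <- rowSums(results[, di_cols])

# Keep rows flagged by at least one method.
flagged <- results[results$n_methods_flagged > 0, ]
print(flagged)
```

The helper column `n_methods_flagged` is not part of the returned data set; it is added here only to make agreement between methods easy to see.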

Examples

library(DisImpact)
library(dplyr)
data(student_equity)
# Multiple group variables
di_iterate(data=student_equity, success_vars=c('Transfer')
  , group_vars=c('Ethnicity', 'Gender'), cohort_vars=c('Cohort')
  , ppg_reference_groups='overall')
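
# Summarized input with weight_var
As described for `weight_var`, a summarized data set stores counts of successes in the success variable and the denominator in a separate weight column. The base R sketch below shows one way such an input might be built from made-up individual-level rows; the column name `N` is an assumption, and the `di_iterate` call is left commented out because it requires the package to be installed.

```r
# Made-up individual-level data: one row per student.
students <- data.frame(
  Ethnicity = c("A", "A", "B", "B", "B"),
  Cohort    = c(2018, 2018, 2018, 2018, 2018),
  Transfer  = c(1, 0, 1, 1, 0)
)

# Collapse to one row per group and cohort: Transfer becomes the count of
# successes and N (hypothetical name) becomes the denominator.
students$N <- 1
summarized <- aggregate(
  cbind(Transfer, N) ~ Ethnicity + Cohort,
  data = students,
  FUN  = sum
)
print(summarized)

# With the package loaded, the summarized data could then be passed as:
# di_iterate(data = summarized, success_vars = "Transfer",
#            group_vars = "Ethnicity", cohort_vars = "Cohort",
#            weight_var = "N")
```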

DisImpact documentation built on Oct. 11, 2022, 1:06 a.m.
