decompose_divergence: Additively decompose divergence scores by and within groups
In arthurgailes/rsegregation: Calculate Empirical Measures of Segregation


View source: R/decompose.R

The Divergence Index is additively decomposable. This function allows for splitting a population into groups of observations and calculating the divergence score within those groups and between those groups.

decompose_divergence(
  dataframe,
  groupCol = NULL,
  popCol = NA,
  weightCol = NA,
  output = "scores",
  ...
)

`dataframe`	A dataframe composed of numeric/integer columns representing percentages of each population group. All columns are used in the divergence calculation except for those specified in `groupCol` and (optionally) `popCol`; no other columns should be included.
`groupCol`	Name of the column(s) in the dataframe used for grouping. If a `grouped_df` is passed to `dataframe`, this parameter is ignored. If multiple groups are used, divergence is aggregated by every unique combination of the groups and compared to the total dataframe.
`popCol`	Either NA (default), which sets the population of each row to 1, or a character string of the column name in `dataframe`.
`weightCol`	Alias for `popCol`.
`output`	Any of: "scores" Default. The individual within and between divergence scores for each row or group, plus the total score. "percentage" One row for each entry (or group), as in "scores", but scaled so that each observation reports its percentage of the total score that would be reported with "summed". "all" The output from `summed`, `weighted`, and `percentage`.
`...`	options passed through to `divergence`

The sum of the scores reported by decompose_divergence with summed = TRUE should always equal the result of divergence(..., summed = TRUE).

Decomposing the divergence index allows users to simultaneously examine the segregation within and between groups of a large geography. Furthermore, users can assess the percentage of total segregation that comes from each group.
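The additive property can be checked by hand. The following is a minimal base R sketch, not the package's implementation: the toy data, the helper names `kl` and `composition`, and the use of natural logarithms are all assumptions made for illustration. It computes a within and a between score for each group and confirms that their population-weighted sum matches the total divergence.

```r
# Toy data (hypothetical): four tracts in two counties; the race columns
# hold proportions, so each row sums to 1.
tracts <- data.frame(
  county = c("A", "A", "B", "B"),
  pop    = c(100, 300, 200, 400),
  white  = c(0.8, 0.6, 0.3, 0.5),
  black  = c(0.2, 0.4, 0.7, 0.5)
)
race_cols <- c("white", "black")

# Divergence of a composition p from a reference q: sum(p * log(p / q)),
# treating 0 * log(0) as 0. Natural log is an assumption of this sketch.
kl <- function(p, q) sum(ifelse(p > 0, p * log(p / q), 0))

# Population-weighted composition of a set of rows.
composition <- function(df) {
  colSums(as.matrix(df[race_cols]) * df$pop) / sum(df$pop)
}

overall <- composition(tracts)

# Total divergence: population-weighted mean of each tract's divergence
# from the overall composition.
tract_div <- apply(as.matrix(tracts[race_cols]), 1, kl, q = overall)
total <- sum(tracts$pop / sum(tracts$pop) * tract_div)

# Per-county scores: tracts compared to their county (within) and the
# county compared to the full population (between).
by_county <- lapply(split(tracts, tracts$county), function(g) {
  g_comp <- composition(g)
  g_div  <- apply(as.matrix(g[race_cols]), 1, kl, q = g_comp)
  data.frame(
    pop                = sum(g$pop),
    within_divergence  = sum(g$pop / sum(g$pop) * g_div),
    between_divergence = kl(g_comp, overall)
  )
})
scores <- do.call(rbind, by_county)

# Weighting each county by its population share recovers the total.
share <- scores$pop / sum(scores$pop)
decomposed_total <- sum(share * (scores$within_divergence +
                                 scores$between_divergence))
stopifnot(isTRUE(all.equal(total, decomposed_total)))

# Share of total segregation attributable to each county.
scores$pct_of_total <-
  share * (scores$within_divergence + scores$between_divergence) / total
```

In this sketch the `pct_of_total` column sums to 1, which is the kind of per-group share the "percentage" output is described as reporting; the package's exact scaling may differ.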

The output parameter "scaled" transforms the divergence index from an absolute to a relative measure of inequality and segregation, and negates several of its desirable properties, including aggregation equivalence and independence (see Roberto, 2016).

A dataframe as specified by the output parameter.

The dataframe will have three columns: 'within_divergence', equivalent to divergence() for each dataframe or group in the dataframe; 'between_divergence', the divergence score of each group's demographics compared to the full population; and weightCol, the sum of the weights for each group. The sum of decompose_divergence(..., summed = T) should equal the result of divergence(..., summed = T).

The divergence parameters for each group are set to their defaults unless explicitly noted above.

decompose_divergence treats the entire dataset it is given as the total population, which may not be desirable in some contexts, for example when returning divergence scores across years. In that context, it is helpful to split the dataframe into a list of dataframes and call decompose_divergence inside an sapply call.
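A minimal sketch of that split-and-apply pattern follows. The toy data and the helper `score_one_year` are hypothetical; the helper is a base R stand-in that returns a single total divergence score (natural log assumed) so the example runs without the package. In practice the call to decompose_divergence would take its place.

```r
# Toy panel (hypothetical): two tracts observed in two years.
panel <- data.frame(
  year  = c(2010, 2010, 2020, 2020),
  pop   = c(100, 300, 150, 350),
  white = c(0.8, 0.6, 0.7, 0.5),
  black = c(0.2, 0.4, 0.3, 0.5)
)

# Stand-in for the per-year scoring function. It treats the rows it
# receives as the whole population, as decompose_divergence is described
# as doing.
score_one_year <- function(df) {
  m <- as.matrix(df[c("white", "black")])
  overall <- colSums(m * df$pop) / sum(df$pop)
  # Row-wise sum of p * log(p / q), with 0 * log(0) treated as 0.
  row_div <- rowSums(ifelse(m > 0, m * log(sweep(m, 2, overall, "/")), 0))
  sum(df$pop / sum(df$pop) * row_div)
}

# Split into one dataframe per year (dropping the year column so it is
# not mistaken for a population group), then score each year separately.
by_year <- split(panel[c("pop", "white", "black")], panel$year)
yearly  <- sapply(by_year, score_one_year)
```

Here `yearly` is a named numeric vector with one score per year, each measured against that year's own population rather than the pooled data. If the per-year function returns a dataframe, as decompose_divergence does, lapply is likely the more convenient choice, since sapply may try to simplify the result.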

Roberto, 2016. "A Decomposable Measure of Segregation and Inequality."

arthurgailes/rsegregation documentation built on May 23, 2021, 6:33 a.m.