mutual: Computes and decomposes the Mutual Information index

mutual R Documentation

Description

Computes and decomposes the Mutual Information index into "between" and "within" terms. The "within" terms can also be decomposed into "exclusive contributions" of segregation sources defined either by group or unit characteristics. The mathematical components required to compute each "within" term can also be displayed at the user's request. The results can be computed over subsamples defined by the user.

Usage

mutual(
  data,
  group,
  unit,
  within = NULL,
  by = NULL,
  contribution.from = NULL,
  components = FALSE,
  cores = NULL
)

Arguments

data

An object of classes "data.table" and "mutual.data".

group

The name of a categorical variable contained in data, or a vector of such names; alternatively, a column number or vector of column numbers of data. Defines the first dimension over which segregation is computed.

unit

The name of a categorical variable contained in data, or a vector of such names; alternatively, a column number or vector of column numbers of data. Defines the second dimension over which segregation is computed.

within

The name of a categorical variable contained in data, or a vector of such names; alternatively, a column number or vector of column numbers of data. Defines the partitions used to compute the "between" and "within" decompositions. Defaults to NULL.

by

The name of a categorical variable contained in data, or a vector of such names; alternatively, a column number or vector of column numbers of data. Defines the subsamples over which the indexes are computed. Defaults to NULL.

contribution.from

Either the character string 'group_vars' or 'unit_vars'; or the name of a categorical variable (or a vector of such names) included in the group or unit parameter; or a column number (or a vector of column numbers) included in the group or unit parameter. Defines the segregation sources whose exclusive contributions to the "within" terms and to the overall index are computed. Defaults to NULL.

components

A logical value. If TRUE, within is not NULL, and by is NULL, the function returns a list: the first element is a data.table that summarizes the total value of the index and its decompositions, and the second element is a data.table with more detailed information on the decomposition of the "within" term (the mathematical components required to compute the "within" terms). If TRUE and both within and by are not NULL, the function returns a list of lists: each first element is a data.table that summarizes the total value of the index and its decompositions in a given subsample, and each second element is a data.table with the detailed decomposition of the "within" term for that same subsample. Defaults to FALSE.

cores

A positive integer. Sets the number of CPU cores used for parallel computation. If NULL, the computation runs on a single core. Parallelization is available on Mac, Linux, Unix, and BSD systems, but not on Windows systems. Defaults to NULL.
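
Because parallelization is not available on Windows, a script meant to run on several platforms may want to choose the value of cores programmatically. The following is a minimal sketch of one way to do so. The helper choose_cores and its reserve argument are hypothetical and are not part of mutualinf; the sketch relies only on the parallel package that ships with R.

```r
# Hypothetical helper (not part of mutualinf): pick a value for `cores`.
# Returns NULL (single-core computation) on Windows or when the number of
# cores cannot be detected; otherwise leaves `reserve` cores free.
choose_cores <- function(reserve = 1L) {
  if (.Platform$OS.type == "windows") {
    return(NULL)
  }
  n <- parallel::detectCores()
  if (is.na(n)) {
    return(NULL)
  }
  max(1L, n - as.integer(reserve))
}

n_cores <- choose_cores()

# The result could then be passed on, for example:
# mutual(data = DT_test, group = "csep", unit = "school", cores = n_cores)
```

Leaving one core free is only a convention to keep the machine responsive; any positive integer is accepted by the cores argument.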

Details

Mixing group variables with unit variables in contribution.from will produce an error.
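
The "between" and "within" terms mentioned in the Description can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not the package's implementation: it assumes the index is computed with natural logarithms, the partition is a clustering of units (schools nested in regions), and the counts are invented for illustration. The function mutual_index and all object names are hypothetical.

```r
# Mutual information index of a group-by-unit table of counts
# (natural logarithm assumed).
mutual_index <- function(counts) {
  p  <- counts / sum(counts)       # joint proportions
  pg <- rowSums(p)                 # group marginals
  pu <- colSums(p)                 # unit marginals
  expected <- outer(pg, pu)        # proportions under independence
  nz <- p > 0                      # 0 * log(0) is treated as 0
  sum(p[nz] * log(p[nz] / expected[nz]))
}

# Toy data (made up): 2 groups (rows) x 4 schools (columns).
# Schools s1-s2 belong to region "A", schools s3-s4 to region "B".
counts <- matrix(c(30, 10,
                   20, 20,
                    5, 25,
                   10, 30),
                 nrow = 2,
                 dimnames = list(group = c("g1", "g2"),
                                 school = paste0("s", 1:4)))
region <- c("A", "A", "B", "B")

M_total <- mutual_index(counts)

# Between term: segregation of groups across regions (schools pooled).
by_region <- t(rowsum(t(counts), group = region))
M_between <- mutual_index(by_region)

# Within term: average of the index inside each region, weighted by the
# region's share of the total population.
region_share <- colSums(by_region) / sum(counts)
M_within_each <- sapply(names(region_share), function(r) {
  mutual_index(counts[, region == r, drop = FALSE])
})
M_within <- sum(region_share * M_within_each)

# The two terms add up to the overall index.
all.equal(M_total, M_between + M_within)
```

The final check holds because, when every unit belongs to exactly one partition cell, the overall index equals the index computed across cells plus the population-weighted sum of the indexes computed inside each cell.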

Value

A data.table if the components option is FALSE; a list if the components option is TRUE, the within option is not NULL and the by option is NULL; or a list of lists if the components option is TRUE, and both within and by options are not NULL.
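
As an illustration of the list returned when components is TRUE, the sketch below requests the decomposition within regions and prints both elements. It assumes that the mutualinf package is installed and that DT_test, the prepared data set used in the Examples section, already exists in the workspace; if either assumption fails, the block does nothing. The choice of "region" as the within variable is only for illustration.

```r
# Runs only if mutualinf is installed and DT_test (assumed to be the
# prepared data from the Examples section) is available.
if (requireNamespace("mutualinf", quietly = TRUE) && exists("DT_test")) {
  res <- mutualinf::mutual(
    data = DT_test,
    group = c("csep", "ethnicity"),
    unit = "school",
    within = "region",
    components = TRUE
  )
  print(res[[1]])  # summary of the index and its decompositions
  print(res[[2]])  # components used to compute the "within" term
}
```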

References

Frankel, D. and Volij, O. (2011). Measuring school segregation. Journal of Economic Theory, 146(1):1-38. doi:10.1016/j.jet.2010.10.008.

Guinea-Martin, D., Mora, R., and Ruiz-Castillo, J. (2018). The evolution of gender segregation over the life course. American Sociological Review, 83(5):983-1019. doi:10.1177/0003122418794503.

Mora, R. and Guinea-Martin, D. (2021). Computing decomposable multigroup indexes of segregation. UC3M Working papers, Economics 31803. Universidad Carlos III de Madrid. Departamento de Economía.

Mora, R. and Ruiz-Castillo, J. (2011). Entropy-based segregation indices. Sociological Methodology, 41(1):159-194. doi:10.1111/j.1467-9531.2011.01237.x.

Theil, H. and Finizza, A. J. (1971). A note on the measurement of racial integration of schools by means of informational concepts. The Journal of Mathematical Sociology, 1(2):187-193. doi:10.1080/0022250X.1971.9989795.

Examples

# To compute the overall measure of school segregation by socioeconomic and ethnic status.
mutual(data = DT_test, group = c("csep", "ethnicity"), unit = "school")

# To compute, separately for each region, the exclusive contributions of specific segregation
# sources to the index, e.g., the socioeconomic and ethnic contributions, and the contribution
# that cannot be attributed to either of them (the "interaction" term).
mutual(data = DT_test, group = c("csep", "ethnicity"), unit = "school", by = "region",
       contribution.from = "group_vars")

# For more information on the package, refer to the manual and the README file.


mutualinf documentation built on June 22, 2024, 12:21 p.m.