redist.distances: Compute Distance between Partitions

View source: R/distances.R

plan_distancesR Documentation

Compute Distance between Partitions

Description

Compute Distance between Partitions

Usage

plan_distances(plans, measure = "variation of information", ncores = 1)

redist.distances(plans, measure = "Hamming", ncores = 1, total_pop = NULL)

Arguments

plans

A matrix with one row for each precinct and one column for each map. Required.

measure

String vector indicating which distances to compute. Implemented currently are "Hamming", "Manhattan", "Euclidean", and "variation of information", Use "all" to return all implemented measures. Not case sensitive, and any unique substring is enough, e.g. "ham" for Hamming, or "info" for variation of information.

ncores

Number of cores to use for parallel computing. Default is 1.

total_pop

The vector of precinct populations. Used only if computing variation of information. If not provided, equal population of precincts will be assumed, i.e. the VI will be computed with respect to the precincts themselves, and not the population.

Details

Hamming distance measures the number of different precinct assignments between plans. Manhattan and Euclidean distances are the 1- and 2-norms for the assignment vectors. All three of the Hamming, Manhattan, and Euclidean distances implemented here are not invariant to permutations of the district labels; permuting will cause large changes in measured distance, and maps which are identical up to a permutation may be computed to be maximally distant.

Variation of Information is a metric on population partitions (i.e., districtings) which is invariant to permutations of the district labels, and arises out of information theory. It is calculated as

VI(\xi, \xi') = -\sum_{i=1}^n\sum_{j=1}^n pop(\xi_i \cap \xi'_j)/P (2log(pop(\xi_i \cap \xi'_j)) - log(pop(\xi_i)) - log(pop(\xi'_j)))

where \xi,\xi' are the partitions, \xi_i,\xi_j the individual districts, pop(\cdot) is the population, and P the total population of the state. VI is also expressible as the difference between the joint entropy and the mutual information (see references).

Value

distance_matrix returns a numeric distance matrix for the chosen metric.

a named list of distance matrices, one for each distance measure selected.

References

Cover, T. M. and Thomas, J. A. (2006). Elements of information theory. John Wiley & Sons, 2 edition.

Examples

data(fl25)
data(fl25_enum)

plans_05 <- fl25_enum$plans[, fl25_enum$pop_dev <= 0.05]
distances <- redist.distances(plans_05)
distances$Hamming[1:5, 1:5]


redist documentation built on June 24, 2024, 9:10 a.m.