CompareOneToOne-function: Compare a pair of spatial histograms

CompareOneToOne-functionR Documentation

Compare a pair of spatial histograms

Description

This function computes the Kantorovich-Wasserstein between a pair of spatial histograms defined over the same grid map.

The grid map is described by the two lists of N coordinates Xs and Ys, which specify the coordinates of the centroid of each tile of the map. For each tile i with coordinates Xs[i], Ys[i], we have the two lists of weights, one for the first histograms and the other for the second histogram.

The two lists of coordinates are passed to compareOneToOne as a matrix with N rows and two columns. The two lists of weights are passed as a matrix with N rows and two columns, a column for each histogram.

Usage

compareOneToOne(Coordinates, Weights, L = 3, recode = TRUE,
           method = "approx",    algorithm = "colgen",
           model="mincostflow",  verbosity = "silent",
           timelimit = 14400,    opt_tolerance = 1e-06,
           unbalanced = FALSE, unbal_cost = 1e+09, convex = TRUE)

Arguments

Coordinates

A Matrix with N rows and two columns:

  • Coordinates[,1]: (First Column) Vector of horizontal coordinates of the centroids of each tile of the map (Xs). Data type: vector of positive integers.

  • Coordinates[,2]: (Second Column) Vector of vertical coordinates of the centroids of each tile of the map (Ys). Data type: vector of positive integers.

Weights

A Matrix of positive weights of the tiles specified by Coordinates.

  • Weights[,1]: (First Column) Weights of the first spatial histogram, a weight for each tile located at position Xs[i], Ys[i] for i=1,...N. Data type: vector of positive doubles.

  • Weights[,2]: (Second Column) Weights of the second spatial histogram, a weight for each tile located at position Xs[i], Ys[i] for i=1,...N. Data type: vector of positive doubles.

L

Approximation parameter. Higher values of L give a more accurate solution, but they require a longer running time. Data type: positive integer.

recode

If equal to True, recode the input coordinates as consecutive integers.

method

Method for computing the KW distances: exact or approx.

algorithm

Algorithm for computing the KW distances: fullmodel or colgen.

model

Model for building the underlying network: bipartite or mincostflow.

verbosity

Level of verbosity of the log: silent, info, or debug.

timelimit

Time limit in second for running the solver.

opt_tolerance

Numerical tolerance on the negative reduced cost for the optimal solution.

unbalanced

If equal to True, solve the problem with unbalanced masses.

unbal_cost

Cost for the arcs going from each point to the extra artificial bin.

convex

If equal to True, compute the convex hull of the input points.

Details

The function compareOneToOne(Coordinates, Weights, ...) computes the distance between the two histograms specified by the weights given in the two columns of matrix Weights. The support points (i.e., centroids of each tile of the map) are defined by the coordinates given in Xs and Ys in the two columns of matrix Coordinates. The algorithm used to compute such distance depends on the parameters specified as optional arguments of the function.

The most important is the parameter L, which by default is equal to 3. The following table shows the worst-case approximation ratio as a function of the value assigned to L. The table also reports the number of arcs in the network flow model as a function of the number of bins n contained in the convex hull of the support points of the histograms given in input with matrix Coordinates.

L 1 2 3 5 10 15
Worst-case error 7.61% 2.68% 1.29% 0.49% 0.12% 0.06%
Number of arcs O(8n) O(16n) O(32n) O(80n) O(256n) O(576n)

The following two figures show the network build on a grid with 8x8 nodes and using L=2 and L=3.

L=2

L=3

Value

Return an R List with the following named attributes:

  • distance: The value of the KW-distance between the two input histograms.

  • status: Status of the solver used to compute the distances.

  • runtime: Overall runtime in seconds to compute all the distances.

  • iterations: Overall number of iterations of the Network Simplex algorithm.

  • nodes: Number of nodes in the network model used to compute the distances.

  • arcs: Number of arcs in the network model used to compute the distances.

See Also

See also compareOneToMany, compareAll, focusArea, Histogram2D, and Solver.

Examples

# Define a simple example
library(SpatialKWD)

# Random coordinates
N = 90
Xs <- as.integer(runif(N, 0, 31))
Ys <- as.integer(runif(N, 0, 31))
coordinates <- matrix(c(Xs, Ys), ncol=2, nrow=N)

# Random weights
test1 <- matrix(runif(2*N, 0, 1), ncol=2, nrow=N)

# Compute distance
print("Compare one-to-one with exact algorithm:")
d <- compareOneToOne(coordinates, Weights=test1, method="exact",
                     recode=TRUE, verbosity = "info")
cat("runtime:", d$runtime, " distance:", d$distance,
    " nodes:", d$nodes, " arcs:", d$arcs, "\n")

print("Compare one-to-one with approximate algorithm:")
d <- compareOneToOne(coordinates, Weights=test1, L=2, recode=TRUE)
cat("L: 2, runtime:", d$runtime, " distance:", d$distance,
    " nodes:", d$nodes, " arcs:", d$arcs, "\n")

d <- compareOneToOne(coordinates, Weights=test1, L=3)
cat("L: 3 runtime:", d$runtime, " distance:", d$distance, "\n")

d <- compareOneToOne(coordinates, Weights=test1, L=10)
cat("L: 10, runtime:", d$runtime, " distance:", d$distance, "\n")

SpatialKWD documentation built on Dec. 9, 2022, 5:08 p.m.