# CompareOneToOne-function: Compare a pair of spatial histograms In SpatialKWD: Spatial KWD for Large Spatial Maps

## Description

This function computes the Kantorovich-Wasserstein between a pair of spatial histograms defined over the same grid map.

The grid map is described by the two lists of `N` coordinates `Xs` and `Ys`, which specify the coordinates of the centroid of each tile of the map. For each tile `i` with coordinates `Xs[i], Ys[i]`, we have the two lists of weights, one for the first histograms and the other for the second histogram.

The two lists of coordinates are passed to `compareOneToOne` as a matrix with `N` rows and two columns. The two lists of weights are passed as a matrix with `N` rows and two columns, a column for each histogram.

## Usage

 ```1 2 3 4 5``` ```compareOneToOne(Coordinates, Weights, L = 3, recode = TRUE, method = "approx", algorithm = "colgen", model="mincostflow", verbosity = "silent", timelimit = 14400, opt_tolerance = 1e-06, unbalanced = FALSE, unbal_cost = 1e+09, convex = TRUE) ```

## Arguments

 `Coordinates` A `Matrix` with `N` rows and two columns: `Coordinates[,1]`: (First Column) Vector of horizontal coordinates of the centroids of each tile of the map (`Xs`). Data type: vector of positive integers. `Coordinates[,2]`: (Second Column) Vector of vertical coordinates of the centroids of each tile of the map (`Ys`). Data type: vector of positive integers. `Weights` A `Matrix` of positive weights of the tiles specified by `Coordinates`. `Weights[,1]`: (First Column) Weights of the first spatial histogram, a weight for each tile located at position `Xs[i], Ys[i]` for `i=1,...N`. Data type: vector of positive doubles. `Weights[,2]`: (Second Column) Weights of the second spatial histogram, a weight for each tile located at position `Xs[i], Ys[i]` for `i=1,...N`. Data type: vector of positive doubles. `L` Approximation parameter. Higher values of L give a more accurate solution, but they require a longer running time. Data type: positive integer. `recode` If equal to `True`, recode the input coordinates as consecutive integers. `method` Method for computing the KW distances: `exact` or `approx`. `algorithm` Algorithm for computing the KW distances: `fullmodel` or `colgen`. `model` Model for building the underlying network: `bipartite` or `mincostflow`. `verbosity` Level of verbosity of the log: `silent`, `info`, or `debug`. `timelimit` Time limit in second for running the solver. `opt_tolerance` Numerical tolerance on the negative reduced cost for the optimal solution. `unbalanced` If equal to `True`, solve the problem with unbalanced masses. `unbal_cost` Cost for the arcs going from each point to the extra artificial bin. `convex` If equal to `True`, compute the convex hull of the input points.

## Details

The function `compareOneToOne(Coordinates, Weights, ...)` computes the distance between the two histograms specified by the weights given in the two columns of matrix `Weights`. The support points (i.e., centroids of each tile of the map) are defined by the coordinates given in `Xs` and `Ys` in the two columns of matrix `Coordinates`. The algorithm used to compute such distance depends on the parameters specified as optional arguments of the function.

The most important is the parameter `L`, which by default is equal to 3. The following table shows the worst-case approximation ratio as a function of the value assigned to `L`. The table also reports the number of arcs in the network flow model as a function of the number of bins n contained in the convex hull of the support points of the histograms given in input with matrix `Coordinates`.

 L 1 2 3 5 10 15 `Worst-case error` 7.61% 2.68% 1.29% 0.49% 0.12% 0.06% `Number of arcs` O(8n) O(16n) O(32n) O(80n) O(256n) O(576n)

The following two figures show the network build on a grid with 8x8 nodes and using L=2 and L=3.

## Value

Return an R List with the following named attributes:

• `distance`: The value of the KW-distance between the two input histograms.

• `status`: Status of the solver used to compute the distances.

• `runtime`: Overall runtime in seconds to compute all the distances.

• `iterations`: Overall number of iterations of the Network Simplex algorithm.

• `nodes`: Number of nodes in the network model used to compute the distances.

• `arcs`: Number of arcs in the network model used to compute the distances.

See also `compareOneToMany`, `compareAll`, `focusArea`, `Histogram2D`, and `Solver`.
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29``` ```# Define a simple example library(SpatialKWD) # Random coordinates N = 90 Xs <- as.integer(runif(N, 0, 31)) Ys <- as.integer(runif(N, 0, 31)) coordinates <- matrix(c(Xs, Ys), ncol=2, nrow=N) # Random weights test1 <- matrix(runif(2*N, 0, 1), ncol=2, nrow=N) # Compute distance print("Compare one-to-one with exact algorithm:") d <- compareOneToOne(coordinates, Weights=test1, method="exact", recode=TRUE, verbosity = "info") cat("runtime:", d\$runtime, " distance:", d\$distance, " nodes:", d\$nodes, " arcs:", d\$arcs, "\n") print("Compare one-to-one with approximate algorithm:") d <- compareOneToOne(coordinates, Weights=test1, L=2, recode=TRUE) cat("L: 2, runtime:", d\$runtime, " distance:", d\$distance, " nodes:", d\$nodes, " arcs:", d\$arcs, "\n") d <- compareOneToOne(coordinates, Weights=test1, L=3) cat("L: 3 runtime:", d\$runtime, " distance:", d\$distance, "\n") d <- compareOneToOne(coordinates, Weights=test1, L=10) cat("L: 10, runtime:", d\$runtime, " distance:", d\$distance, "\n") ```