Information-theoretical V-measure for Spatial Association"

1. The principle of the information-theoretical V-measure

Let us denote the area of the domain as A. Consider two different regionalizations of the domain. To make a further discussion more lucid, we will refer to the first one as a regionalization and to the second one as a partition. The regionalization $R$ divides the domain into $n$ regions $r_i \mid i = 1,\ldots,n$. The partition $Z$ divides the domain into $m$ zones $z_j \mid j = 1,\ldots,n$. Both $R$ and $Z$ are essentially integer-type vectors with equal elements.

$$ h = 1 - \sum\limits_{j=1}^m \frac{A_j}{A} \frac{S_j^R}{S^R} $$

where $S^R = - \sum\limits_{i=1}^n \frac{A_i}{A} \log\frac{A_i}{A}$, $S_j^R = - \sum\limits_{i=1}^n \frac{a_{i,j}}{A_j} \log \frac{a_{i,j}}{A_j}$, and $a_{i,j}$ represents the count of elements where $R==i$ and $Z==j$. $A_i$ is the number of elements in the vector where $R==i$, and $A_j$ is the number of elements in the vector where $Z==j$.

By swapping $R$ and $Z$, $c$ can be calculated. Finally, the v-measure can be calculated useing the below formula:

$$ V_{\beta} = \frac{(1+\beta)hc}{(\beta h) + c} $$

2. Example

install.packages("itmsa", dep = TRUE)
install.packages("gdverse", dep = TRUE)
library(itmsa)
ntds = gdverse::NTDs
ntds$incidence = sdsfun::discretize_vector(ntds$incidence, 5)
itm(incidence ~ watershed + elevation + soiltype,
    data = ntds, method = "vm")
## # A tibble: 3 × 3
##   Variable     Iv    Pv
##   <chr>     <dbl> <dbl>
## 1 watershed 0.373     0
## 2 elevation 0.365     0
## 3 soiltype  0.213     0


Try the itmsa package in your browser

Any scripts or data that you put into this service are public.

itmsa documentation built on April 4, 2025, 5:17 a.m.