View source: R/GeoNeighIndex.R
| GeoNeighIndex | R Documentation |
The function returns nearest-neighbour pair indices for spatial, spatio-temporal, or bivariate data. Optionally, a stochastic thinning mechanism can be applied to retain only a subset of the candidate nearest-neighbour pairs.
GeoNeighIndex(coordx, coordy=NULL, coordz=NULL, coordt=NULL,
coordx_dyn=NULL, distance="Eucl", neighb=4,
maxdist=NULL, maxtime=1, radius=1,
bivariate=FALSE, p_neighb=1,
thin_method="bernoulli")
coordx |
A numeric ( |
coordy |
A numeric vector giving one dimension of spatial coordinates; optional argument,
default is |
coordz |
A numeric vector giving one dimension of spatial coordinates; optional argument,
default is |
coordt |
A numeric vector giving the temporal coordinates. Optional argument,
default is |
coordx_dyn |
A list of numeric coordinate matrices containing spatial coordinates that may vary
over time. For spatio-temporal data, the list length must equal the number of time points. For
bivariate data with different spatial supports, the list must have length two, with one coordinate
matrix for each variable. Optional argument, default is |
distance |
String; the name of the spatial distance. Default is |
neighb |
Numeric; a positive integer indicating the nearest-neighbour order. In the bivariate case, it may also be a vector of length three, corresponding to within-variable 1, cross-variable, and within-variable 2 neighbourhood sizes. |
maxdist |
A numeric value denoting the maximum spatial distance; see Details. In the bivariate case, it may also be a vector of length three, corresponding to within-variable 1, cross-variable, and within-variable 2 distance thresholds. |
maxtime |
A numeric value denoting the maximum temporal distance; see Details. |
radius |
Numeric; a value indicating the radius of the sphere when using great-circle distances.
Default value is |
bivariate |
Logical; if |
p_neighb |
Numeric; a value in |
thin_method |
String; stochastic thinning scheme. Available options are |
The function first builds a candidate set of directed nearest-neighbour pairs. For purely spatial data,
the candidate set contains spatial nearest-neighbour pairs. For spatio-temporal data, the function includes
within-time spatial pairs, pure temporal same-site pairs, and cross-time spatio-temporal pairs up to
maxtime. For bivariate data, the function includes within-variable and cross-variable pairs.
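For the purely spatial case, the idea of a directed nearest-neighbour candidate set can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses a brute-force Euclidean distance matrix, ignores maxdist, and is not the GeoModels implementation. The names rowidx, colidx and lags mirror the returned components, while D and nn are hypothetical helpers.

```r
# Illustrative sketch only: brute-force directed nearest-neighbour pairs.
set.seed(3)
n <- 30                                  # number of locations (hypothetical)
coords <- cbind(runif(n), runif(n))
neighb <- 4                              # neighbours kept per location

D <- as.matrix(dist(coords))             # Euclidean distance matrix
diag(D) <- Inf                           # a location is not its own neighbour

# For each location, the indices of its 'neighb' closest locations.
# apply() over rows returns a neighb x n matrix (one column per location).
nn <- apply(D, 1, function(r) order(r)[seq_len(neighb)])

rowidx <- rep(seq_len(n), each = neighb) # target index of each directed pair
colidx <- as.vector(nn)                  # neighbour index of each directed pair
lags   <- D[cbind(rowidx, colidx)]       # spatial distance of each pair

head(data.frame(rowidx, colidx, lags))
```

Each location contributes neighb directed pairs, so the candidate set here has n * neighb rows.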
If thin_method="bernoulli" and p_neighb<1, candidate pairs are retained independently with
calibrated Bernoulli probabilities. These probabilities may depend on pair features, such as spatial or
temporal lag, but they are calibrated so that the expected number of retained pairs is approximately
p_neighb times the number of candidate pairs.
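One way to picture calibrated Bernoulli thinning is sketched below in base R. The weight function, the candidate lags and the calibration through uniroot are assumptions made for illustration; the package may calibrate its probabilities differently.

```r
# Illustrative sketch only: not the GeoModels calibration.
set.seed(1)
d <- 2000                         # number of candidate pairs (hypothetical)
lags <- runif(d, 0, 0.2)          # hypothetical spatial lags of the candidates
p_neighb <- 0.2                   # desired expected retained fraction

# Hypothetical feature-dependent weights: shorter lags are favoured.
w <- exp(-lags / 0.05)

# Find a constant such that the mean retention probability equals p_neighb.
# Probabilities are clipped at 1 so that they remain valid.
calib <- function(const) mean(pmin(1, const * w)) - p_neighb
const <- uniroot(calib, interval = c(0, 1 / min(w)), tol = 1e-10)$root
prob  <- pmin(1, const * w)

keep <- runif(d) < prob           # independent Bernoulli draws, one per pair

c(expected = sum(prob), target = p_neighb * d, retained = sum(keep))
```

The expected count sum(prob) agrees with p_neighb * d up to the root-finding tolerance, while the realised count sum(keep) fluctuates around it from one draw to the next.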
If thin_method="TargetBalanced", the function applies a hard-core greedy matching procedure. A random
permutation of the candidate pairs is scanned, and a pair is retained only if neither endpoint has already
been used by a previously retained pair. Therefore no observation index is used in more than one retained
pair. In this case p_neighb is not a marginal inclusion probability. It defines the nominal target
round(p_neighb * d), where d is the number of candidate pairs, but the final number of retained
pairs is bounded above by floor(n/2), where n is the number of observation indices,
and may be smaller due to matching feasibility.
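The greedy scan described above can be sketched in base R as follows. The candidate pairs are simulated, and stopping as soon as the capped target is reached is an assumption of this sketch; it is not taken from the package source.

```r
# Illustrative sketch only: greedy endpoint-disjoint selection of pairs.
set.seed(2)
n <- 50                                        # number of observation indices
rowidx <- rep(seq_len(n), each = 4)            # hypothetical candidate pairs
colidx <- vapply(rowidx,
                 function(i) sample(setdiff(seq_len(n), i), 1L),
                 integer(1))
d <- length(rowidx)                            # number of candidate pairs
p_neighb <- 0.1

# Nominal target, capped by the largest possible number of disjoint pairs.
target <- min(round(p_neighb * d), floor(n / 2))

used <- logical(n)                             # endpoints already taken
keep <- logical(d)                             # retained candidate pairs
for (k in sample.int(d)) {                     # random permutation of the pairs
  if (sum(keep) >= target) break               # assumed stopping rule
  i <- rowidx[k]
  j <- colidx[k]
  if (!used[i] && !used[j]) {                  # both endpoints still free
    keep[k] <- TRUE
    used[c(i, j)] <- TRUE
  }
}

c(candidates = d, target = target, retained = sum(keep))
```

Because every retained pair consumes two distinct indices, the retained count can never exceed floor(n/2), and it can fall short of the target when the remaining candidates all share an endpoint with a pair already kept.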
If thin_method="bernoulli" and p_neighb=1, no thinning is applied. If
thin_method="TargetBalanced" and p_neighb=1, the function attempts to retain as many pairs as allowed
by the hard-core matching constraint; this is not equivalent to no thinning.
Returns a list containing some of the following components:
colidx |
Vector of neighbour indices. |
rowidx |
Vector of target indices. |
lags |
Vector of spatial distances. |
lagt |
Vector of temporal distances, returned for spatio-temporal data. |
first |
Variable indicator for the first component of a bivariate pair, returned for bivariate data. |
second |
Variable indicator for the second component of a bivariate pair, returned for bivariate data. |
maxdist |
Maximum spatial distance used to construct the candidate pairs, when available. |
neighb |
Nearest-neighbour order used to construct the candidate pairs, when available. |
n_candidates |
Number of candidate pairs before thinning. |
n_retained |
Number of pairs retained after thinning or matching. |
target_retained |
Target number of retained pairs. For Bernoulli thinning this is the expected retained count. For hard-core matching this is the capped target count. |
target_retained_raw |
Uncapped target number of retained pairs, returned for hard-core matching. |
TargetBalanceding_cap |
Maximum number of endpoint-disjoint pairs, returned for hard-core matching. |
effective_fraction |
Observed retained fraction, |
expected_retained |
Expected retained count under calibrated Bernoulli thinning. |
thin_method |
Thinning method used. |
p_neighb_interpretation |
Text description of how |
Moreno Bevilacqua, moreno.bevilacqua89@gmail.com, https://sites.google.com/view/moreno-bevilacqua/home, Victor Morales Onate, victor.morales@uv.cl, https://sites.google.com/site/moralesonatevictor/, Christian Caamano-Carrillo, chcaaman@ubiobio.cl, https://www.researchgate.net/profile/Christian-Caamano
library(GeoModels)
## set the seed before any random draw so that the locations are reproducible too
set.seed(951)
NN <- 400
coords <- cbind(runif(NN), runif(NN))
corrmodel <- "Matern"
scale <- 0.5/3
param <- list(mean=0, sill=1, nugget=0, scale=scale, smooth=0.5)
data <- GeoSim(coordx=coords, corrmodel=corrmodel,
model="Gaussian", param=param)$data
sel <- GeoNeighIndex(coordx=coords, neighb=5)
data1 <- data[sel$colidx]
data2 <- data[sel$rowidx]
## plotting pairs that are neighbours of order 5
plot(data1, data2, xlab="", ylab="",
main="h-scatterplot, neighb=5")
## Bernoulli thinning: p_neighb controls the expected retained fraction
sel_ber <- GeoNeighIndex(coordx=coords, neighb=5,
p_neighb=0.2,
thin_method="bernoulli")
data1 <- data[sel_ber$colidx]
data2 <- data[sel_ber$rowidx]
## plotting a random fraction of pairs that are neighbours of order 5
plot(data1, data2, xlab="", ylab="",
main="h-scatterplot, neighb=5, p_neighb=0.2")