st_nn: Nearest Neighbor Search for Simple Features

View source: R/st_nn.R

st_nnR Documentation

Nearest Neighbor Search for Simple Features

Description

Returns the indices of layer y which are nearest neighbors of each feature of layer x. The number of nearest neighbors k and the search radius maxdist can be modified.

The function has three modes of operation:

  • lon-lat points—Calculation using C code from GeographicLib, similar to sf::st_distance

  • projected points—Calculation using nabor::knn, a fast search method based on the libnabo C++ library

  • lines or polygons, either lon-lat or projected—Calculation based on sf::st_distance

Usage

st_nn(
  x,
  y,
  sparse = TRUE,
  k = 1,
  maxdist = Inf,
  returnDist = FALSE,
  progress = TRUE,
  parallel = 1
)

Arguments

x

Object of class sf or sfc

y

Object of class sf or sfc

sparse

logical; should a sparse index list be returned (TRUE, the default) or a dense logical matrix? See "Value" section below.

k

The maximum number of nearest neighbors to compute. Default is 1, meaning that only a single point (nearest neighbor) is returned.

maxdist

Search radius (in meters). Points farther than search radius are not considered. Default is Inf, meaning that search is unconstrained.

returnDist

logical; whether to return a second list with the distances between detected neighbors.

progress

Display progress bar? The default is TRUE. When using parallel>1 or when input is projected points, a progress bar is not displayed regardless of progress argument.

parallel

Number of parallel processes. The default parallel=1 implies ordinary non-parallel processing. Parallel processing is not applicable for projected points, where calculation is already highly optimized through the use of nabor::knn. Parallel processing is done with the parallel package.

Value

  • If sparse=TRUE (the default), a sparse list with list element i being a numeric vector with the indices j of neighboring features from y for the feature x[i,], or an empty vector (integer(0)) in case there are no neighbors.

  • If sparse=FALSE, a logical matrix with element [i,j] being TRUE when y[j,] is a neighbor of x[i].

  • If returnDists=TRUE the function returns a list, with the first element as specified above, and the second element a sparse list with the distances (as units vectors, in meters) between each pair of detected neighbors corresponding to the sparse list of indices.

References

C. F. F. Karney, GeographicLib, Version 1.49 (2017-mm-dd), https://geographiclib.sourceforge.io/1.49/

Examples

data(cities)
data(towns)

cities = st_transform(cities, 32636)
towns = st_transform(towns, 32636)
water = st_transform(water, 32636)

# Nearest town
st_nn(cities, towns, progress = FALSE)

# Using 'sfc' objects
st_nn(st_geometry(cities), st_geometry(towns), progress = FALSE)
st_nn(cities, st_geometry(towns), progress = FALSE)
st_nn(st_geometry(cities), towns, progress = FALSE)

# With distances
st_nn(cities, towns, returnDist = TRUE, progress = FALSE)

## Not run: 

# Distance limit
st_nn(cities, towns, maxdist = 7200)
st_nn(cities, towns, k = 3, maxdist = 12000)
st_nn(cities, towns, k = 3, maxdist = 12000, returnDist = TRUE)

# 3 nearest towns
st_nn(cities, towns, k = 3)

# Spatial join
st_join(cities, towns, st_nn, k = 1)
st_join(cities, towns, st_nn, k = 2)
st_join(cities, towns, st_nn, k = 1, maxdist = 7200)
st_join(towns, cities, st_nn, k = 1)

# Polygons to polygons
st_nn(water, towns, k = 4)

# Large example - Geo points
n = 1000
x = data.frame(
  lon = (runif(n) * 2 - 1) * 70,
  lat = (runif(n) * 2 - 1) * 70
)
x = st_as_sf(x, coords = c("lon", "lat"), crs = 4326)
start = Sys.time()
result1 = st_nn(x, x, k = 3)
end = Sys.time()
end - start

# Large example - Geo points - Parallel processing
start = Sys.time()
result2 = st_nn(x, x, k = 3, parallel = 4)
end = Sys.time()
end - start
all.equal(result1, result2)

# Large example - Proj points
n = 1000
x = data.frame(
  x = (runif(n) * 2 - 1) * 70,
  y = (runif(n) * 2 - 1) * 70
)
x = st_as_sf(x, coords = c("x", "y"), crs = 4326)
x = st_transform(x, 32630)
start = Sys.time()
result = st_nn(x, x, k = 3)
end = Sys.time()
end - start

# Large example - Polygons
set.seed(1)
n = 150
x = data.frame(
  lon = (runif(n) * 2 - 1) * 70,
  lat = (runif(n) * 2 - 1) * 70
)
x = st_as_sf(x, coords = c("lon", "lat"), crs = 4326)
x = st_transform(x, 32630)
x = st_buffer(x, 1000000)
start = Sys.time()
result1 = st_nn(x, x, k = 3)
end = Sys.time()
end - start

# Large example - Polygons - Parallel processing
start = Sys.time()
result2 = st_nn(x, x, k = 3, parallel = 4)
end = Sys.time()
end - start
all.equal(result1, result2)


## End(Not run)

nngeo documentation built on April 25, 2023, 1:10 a.m.