get.nearest: Find distance from each point in a set to the nearest of a...

View source: R/get.nearest.R

get.nearest    R Documentation

Find distance from each point in a set to the nearest of a second set of points (by lat/lon).

Description

get.nearest returns the distance from each point in a set to the nearest of a second set of points (by lat/lon).

Usage

get.nearest(
  frompoints,
  topoints,
  units = "miles",
  ignore0 = FALSE,
  return.rownums = FALSE,
  return.latlons = FALSE,
  radius = Inf
)

Arguments

frompoints

A matrix or data.frame with two cols, 'lat' and 'lon' (or only 2 cols that are lat and lon in that order) with datum=WGS84 assumed.

topoints

A matrix or data.frame with two cols, 'lat' and 'lon' (or only 2 cols that are lat and lon in that order) with datum=WGS84 assumed.

units

A string that is 'miles' by default, or 'km' for kilometers, specifying units for distances returned.

ignore0

A logical, default is FALSE, specifying whether to ignore distances that are zero and report only the minimum nonzero distance. Useful for finding the nearest point other than the point itself, for example when frompoints and topoints are the same set.

return.rownums

Logical value, FALSE by default. If TRUE, value returned also includes these 2 columns: a col named fromrow of index numbers starting at 1 specifying the frompoint and a similar col named n specifying the row of the nearest topoint.

return.latlons

Logical value, FALSE by default. If TRUE, value returned also includes four extra columns, showing fromlat, fromlon, tolat, tolon.

radius

Optional number, default is Inf. Distance within which search should be limited, or max distance that will be returned.

Details

This function returns a vector of distances, which are the distances from one set of points to the nearest single member (if any) of another set of points. Points are specified using latitude and longitude in decimal degrees. Relies on the sp package for the spDistsN1 and SpatialPoints functions.
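The behavior described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it swaps in a haversine (spherical) distance instead of sp::spDistsN1, so results may differ slightly. The names haversine_km and nearest_sketch are hypothetical. It assumes that radius is expressed in the same units as the returned distance, and it follows the ignore0 argument description by dropping zero distances before taking the minimum.

```r
# Illustrative great-circle (haversine) distance on a sphere, in km.
# Assumption: a mean Earth radius of 6371 km; the package itself uses sp::spDistsN1.
haversine_km <- function(lat1, lon1, lat2, lon2, r_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper that mimics the documented default output of get.nearest:
# one distance per row of frompoints, Inf when nothing qualifies.
# Columns are assumed to be lat, lon in that order.
nearest_sketch <- function(frompoints, topoints, units = "miles",
                           ignore0 = FALSE, radius = Inf) {
  km_per_mile <- 1.609344
  vapply(seq_len(nrow(frompoints)), function(i) {
    d <- haversine_km(frompoints[i, 1], frompoints[i, 2],
                      topoints[, 1], topoints[, 2])
    if (units == "miles") d <- d / km_per_mile
    if (ignore0) d <- d[d > 0]   # drop zero distances (e.g. the point itself)
    d <- d[d <= radius]          # assumption: radius is in the chosen units
    if (length(d) == 0) Inf else min(d)
  }, numeric(1))
}

from <- data.frame(lat = c(38.9567, 38.9507), lon = c(-77.0897, -77.0896))
to   <- data.frame(lat = c(38.9575, 38.9507, 38.9514),
                   lon = c(-77.0893, -77.0896, -77.0972))

nearest_sketch(from, to)                  # second point coincides with a topoint
nearest_sketch(from, to, ignore0 = TRUE)  # nearest nonzero distance instead
nearest_sketch(from, to, radius = 0.01)   # Inf where nothing is within the radius
```

The loop over frompoints mirrors the one-point-against-many pattern of spDistsN1, which computes distances from a set of points to a single point.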

A future version may use get.distances.all(), but for performance it would do so only for distance pairs (pairs of points) that have first been quickly filtered using lat/lon to be not too far apart, in an attempt to go much faster in an initial pass.

Performance notes: the old get.nearest with loops takes 42 seconds vs 3 seconds for this version, for 100k frompoints and 100 topoints:

Sys.time(); x=get.nearest(t100k, t100); Sys.time()

> Sys.time(); x=get.nearest(testpoints(1e6), testpoints(100)); Sys.time()
[1] 14:33:05 EDT
[1] 14:33:33 EDT

<30 seconds from 1 million frompoints to 100 topoints, as in finding nearest of 100 sites for 9

But R hung/crashed on 11 million frompoints, probably because it ran out of memory. The work may need to be broken up into batches of maybe 1 to 100 million distances at a time. There are 11,078,297 blocks according to http://www.census.gov/geo/maps-data/data/tallies/national_geo_tallies.html
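The batching idea mentioned in these notes could look roughly like the sketch below. It is not part of the package: nearest_in_batches and toy_nearest are hypothetical names, and toy_nearest is a stand-in (planar distance in degrees) used only so the example runs without the package. In practice nearest_fun could be get.nearest with its default arguments, which returns one distance per row of frompoints.

```r
# Hypothetical helper: run a nearest-distance function over frompoints in row batches,
# so that only batch_size * nrow(topoints) distances are held in memory at a time.
# nearest_fun must be a function(frompoints, topoints) returning one number per from row.
nearest_in_batches <- function(frompoints, topoints, nearest_fun,
                               batch_size = 1e5) {
  n <- nrow(frompoints)
  if (n == 0) return(numeric(0))
  out <- numeric(n)
  for (s in seq(1, n, by = batch_size)) {
    rows <- s:min(s + batch_size - 1, n)
    out[rows] <- nearest_fun(frompoints[rows, , drop = FALSE], topoints)
  }
  out
}

# Stand-in distance function for demonstration only (planar distance in degrees,
# not a geographic distance).
toy_nearest <- function(from, to) {
  vapply(seq_len(nrow(from)), function(i) {
    min(sqrt((from[i, 1] - to[, 1])^2 + (from[i, 2] - to[, 2])^2))
  }, numeric(1))
}

set.seed(1)
from <- cbind(lat = runif(25, 30, 40), lon = runif(25, -100, -90))
to   <- cbind(lat = runif(5, 30, 40),  lon = runif(5, -100, -90))

batched <- nearest_in_batches(from, to, toy_nearest, batch_size = 10)
all.equal(batched, toy_nearest(from, to))  # batching should not change the result
```

With 100 topoints, a batch_size of 1e6 would correspond to 100 million distances per batch, the upper end of the range suggested in the notes.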

Value

By default, returns a vector of distances, but can return a matrix of numbers, with columns that can include fromrow and torow (indexing which of topoints is nearest, or the first if more than one matches), fromlat, fromlon, tolat, tolon, and d (distance). Returns Inf when no topoints are found within the radius, and also when a distance to nearest is zero but ignore0=TRUE. Distance returned is in miles by default, or in kilometers if units='km'. See the arguments for details on other formats that may be returned if specified.
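Because Inf marks "nothing found", summaries of the result usually need to treat those entries separately. A minimal sketch follows, where d is a made-up vector standing in for the default output of get.nearest (the values are illustrative, not real results):

```r
# d stands in for a vector returned by get.nearest(); values are invented for illustration.
d <- c(0.4, Inf, 2.1, Inf, 0.0)

found <- is.finite(d)   # TRUE where a nearest topoint was reported
sum(found)              # how many frompoints had a match
mean(d[found])          # average distance among matched points only

# Optionally recode "not found" as NA so that summary functions can skip it
d_na <- ifelse(found, d, NA_real_)
summary(d_na)
```

Recoding to NA is one possible convention; leaving Inf in place keeps comparisons such as d <= some_cutoff working without special handling.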

See Also

get.distances which gets distances between all points (within an optional search radius), get.distances.all which allows you to get distances between all points, get.distances.prepaired for finding distances when data are already formatted as pairs of points, and proxistat which calculates a proximity score for each spatial unit based on distances to nearby points.

Examples

set.seed(999)
t1=testpoints(1)
t10=testpoints(10)
t100=testpoints(100)
t1k=testpoints(1e3)
t10k=testpoints(1e4)
t100k=testpoints(1e5)
t1m=testpoints(1e6)
#t10m=testpoints(1e7)

get.nearest(t1, t1)
get.nearest(t1, t10[2, ,drop=FALSE])
get.nearest(t10, t1k)
get.nearest(t10, t1k, radius=500, units='km')
get.nearest(t10, t1k, radius=10, units='km')

test.from <- structure(list(fromlat = c(38.9567309094, 38.9507043428), 
 fromlon = c(-77.0896572305, -77.0896199948)), .Names = c("lat", "lon"), 
 row.names = c("6054762", "6054764"), class = "data.frame")
test.to <- structure(list(tolat = c(38.9575019287, 38.9507043428, 38.9514152435), 
 tolon = c(-77.0892818598, -77.0896199948, -77.0972395245)), .Names = c("lat", "lon"),
 class = "data.frame", row.names = c("6054762", "6054763", "6054764"))
get.nearest(test.from, test.to)
get.nearest(testpoints(10), testpoints(30))

ejanalysis/proxistat documentation built on Jan. 1, 2025, 10:02 a.m.