get.distances: Find distances between nearby points, within specified...

View source: R/get.distances.R

get.distances    R Documentation

Find distances between nearby points, within specified radius.

Description

WORK IN PROGRESS. Returns the distances from one set of points to nearby members of another set of points.

Usage

get.distances(
  frompoints,
  topoints,
  radius = 5200,
  units = "miles",
  ignore0 = FALSE,
  dfunc = "sp",
  as.df = FALSE,
  return.rownums = TRUE,
  return.latlons = FALSE,
  return.crosstab = FALSE,
  tailored.deltalon = FALSE
)

Arguments

frompoints

A matrix or data.frame with two columns, 'lat' and 'lon', in decimal degrees, with datum=WGS84 assumed.

topoints

A matrix or data.frame with two columns, 'lat' and 'lon', in decimal degrees, with datum=WGS84 assumed.

radius

A single number defining what counts as nearby: the maximum distance searched for or recorded. Default is max allowed... radius must be less than about 8,368 kilometers (5,200 miles, or roughly the distance from Hawaii to Maine).

units

A string that is 'miles' by default, or 'km' for kilometers, specifying units for radius and distances returned.

ignore0

A logical, default is FALSE, specifying whether to ignore distances that are zero and report only nonzero distances. Useful if you want the distance to points other than self, for example where frompoints=topoints. Ignored if return.crosstab = TRUE.

dfunc

Optional character string specifying the distance function: "hf" for the Haversine formula or "slc" for the spherical law of cosines. If "sp" (default), the sp package is used to find distances more accurately and more quickly.

as.df

Optional logical, default is FALSE (as shown in Usage).

return.rownums

Logical value, TRUE by default. If TRUE, value returned also includes two extra columns: a column of index numbers starting at 1 specifying the frompoint and a similar column specifying the topoint.

return.latlons

Logical value, FALSE by default. If TRUE, value returned also includes four extra columns, showing fromlat, fromlon, tolat, tolon.

return.crosstab

Logical value, FALSE by default. If TRUE, value returned is a matrix of the distances, with a row per frompoint and a column per topoint. (Distances larger than the maximum search radius are not provided, even in this format.)

tailored.deltalon

Logical value, FALSE by default, but ignored. Left over from an older version of get.distances, where it defined the size of the initially searched area as a function of latitude for each frompoint, rather than initially searching a conservatively large box.

Details

This function returns a matrix or vector of distances, which are the distances from one set of points to the nearby members of another set of points. It searches within a circle (of radius = radius, defining what is considered nearby), to calculate distance (in miles or km) from each of frompoints to each of topoints that is within the specified radius. Points are specified using latitude and longitude in decimal degrees.
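The search described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper names haversine_miles and nearby_pairs are hypothetical, the Earth radius of 3958.8 miles is an assumed mean value, and the Haversine formula is used here simply because it needs no extra packages.

```r
# Hypothetical helper: great-circle distance in miles (Haversine formula).
# Assumes a spherical Earth with mean radius 3958.8 miles (an assumption).
haversine_miles <- function(lat1, lon1, lat2, lon2, earth_radius = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * earth_radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: all (fromrow, torow, d) pairs within `radius` miles.
nearby_pairs <- function(frompoints, topoints, radius) {
  # Every combination of a frompoint row and a topoint row
  pairs <- expand.grid(fromrow = seq_len(nrow(frompoints)),
                       torow   = seq_len(nrow(topoints)))
  pairs$d <- haversine_miles(
    frompoints$lat[pairs$fromrow], frompoints$lon[pairs$fromrow],
    topoints$lat[pairs$torow],     topoints$lon[pairs$torow])
  # Keep only the pairs that fall inside the search circle
  pairs[pairs$d <= radius, , drop = FALSE]
}

from <- data.frame(lat = c(38.9567309094, 45), lon = c(-77.0896572305, -100))
to   <- data.frame(lat = c(38.9575019287, 38.9507043428, 45),
                   lon = c(-77.0892818598, -77.2, -90))
near <- nearby_pairs(from, to, radius = 10)
print(near)
```

Comparing every frompoint with every topoint in this way is quadratic in the number of points, so it is only practical for small inputs.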

Uses get.distances.all. Relies on the sp package for the spDistsN1 and SpatialPoints functions.
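As a rough sketch of how spDistsN1 can be called directly (not necessarily how this function calls it): spDistsN1 takes a matrix of points and a single point, expects longitude in the first column and latitude in the second, and with longlat = TRUE returns great-circle distances in kilometers. The variable names and the conversion factor of 1.609344 km per mile are choices made for this example.

```r
# Sketch only: needs the sp package, and is skipped if sp is not installed.
if (requireNamespace("sp", quietly = TRUE)) {
  # One frompoint, given as c(longitude, latitude)
  from <- c(lon = -77.0896572305, lat = 38.9567309094)
  # Three topoints as a two-column matrix: longitude first, then latitude
  to <- cbind(lon = c(-77.0892818598, -77.2, -90),
              lat = c(38.9575019287, 38.9507043428, 45))
  # longlat = TRUE requests great-circle distances in kilometers
  d_km <- sp::spDistsN1(to, from, longlat = TRUE)
  # Convert to miles (1 mile = 1.609344 km)
  d_miles <- d_km / 1.609344
  print(round(d_miles, 3))
}
```

Because frompoints and topoints are documented here with 'lat' and 'lon' columns, the columns have to be put in lon, lat order before being passed to sp.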

Regarding distance calculation, also see http://en.wikipedia.org/wiki/Vincenty%27s_formulae, http://williams.best.vwh.net/avform.htm#Dist, http://sourceforge.net/projects/geographiclib/, and http://www.r-bloggers.com/great-circle-distance-calculations-in-r/.

Finding distances to all of the 11 million Census blocks in the USA within 5 km, for 100 points, can take a while. It may be worth looking at a JavaScript library like turf, or investigating data.table to index and more quickly subset the (potentially 11 million Census blocks of) topoints (or pre-indexing that block point dataset and allowing this function to accept a data.table as input).
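One common way to reduce this cost is to discard topoints outside a latitude/longitude box around each frompoint before computing exact distances. The base R sketch below illustrates the idea under stated assumptions: the function name candidates_in_box is hypothetical, 68.7 miles per degree of latitude is a deliberately low approximation so that the box errs on the large side, and the box ignores wraparound at longitude 180 and is not meaningful very near the poles.

```r
# Hypothetical helper: row numbers of topoints inside a conservative
# lat/lon box around one frompoint. Exact distances would still need to
# be computed afterwards for the surviving candidates.
candidates_in_box <- function(lat0, lon0, topoints, radius_miles,
                              miles_per_deg = 68.7) {
  # Half-height of the box in degrees of latitude
  dlat <- radius_miles / miles_per_deg
  # Degrees of longitude cover less ground away from the equator; use the
  # box edge farthest from the equator so the box is never too narrow.
  worst_lat <- min(abs(lat0) + dlat, 89.9)
  dlon <- dlat / cos(worst_lat * pi / 180)
  which(topoints$lat >= lat0 - dlat & topoints$lat <= lat0 + dlat &
        topoints$lon >= lon0 - dlon & topoints$lon <= lon0 + dlon)
}

to <- data.frame(lat = c(38.9575019287, 38.9507043428, 45),
                 lon = c(-77.0892818598, -77.2, -90))
idx <- candidates_in_box(38.9567309094, -77.0896572305, to, radius_miles = 10)
print(idx)
```

The box test uses only comparisons, so it is much cheaper per point than a trigonometric distance calculation, and an indexed or sorted table could make the subsetting faster still.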

Value

By default, returns a dataframe that has 3 columns: fromrow, torow, distance (where fromrow or torow is the row number of the corresponding input, starting at 1). Distance returned is in miles by default, but with option to set units='km' to get kilometers. See parameters for details on other formats that may be returned if specified.
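To illustrate how the long format relates to a crosstab-style matrix, the sketch below fills a matrix from (fromrow, torow, distance) rows. The distance values are made up for illustration, the column name d is assumed from the examples on this page, and using NA for pairs beyond the radius is an assumption rather than documented behavior.

```r
# Made-up long-format result: one row per nearby (frompoint, topoint) pair
long <- data.frame(fromrow = c(1, 1, 2),
                   torow   = c(1, 2, 3),
                   d       = c(0.06, 5.94, 488))

# Assumed input sizes for this illustration: 2 frompoints, 3 topoints
crosstab <- matrix(NA_real_, nrow = 2, ncol = 3)

# A two-column index matrix assigns each distance to its [fromrow, torow] cell
crosstab[cbind(long$fromrow, long$torow)] <- long$d
print(crosstab)
```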

See Also

get.distances.all which allows you to get distances between all points, get.distances.prepaired for finding distances when data are already formatted as pairs of points, get.nearest which finds the distance to the single nearest point within a specified search radius instead of all topoints, and proxistat which calculates a proximity score for each spatial unit based on distances to nearby points.

Examples

#
set.seed(999)
t1=testpoints(1)
t10=testpoints(10)
t100=testpoints(100,  minlat = 25, maxlat = 45, minlon = -100, maxlon = -60)
t1k=testpoints(1e3,   minlat = 25, maxlat = 45, minlon = -100, maxlon = -60)
t10k=testpoints(1e4)
t100k=testpoints(1e5)
t1m=testpoints(1e6)
#t10m=testpoints(1e7)
   # Two frompoints and three topoints, each with 'lat' and 'lon' columns
   test.from <- data.frame(lat = c(38.9567309094, 45),
     lon = c(-77.0896572305, -100))

   test.to <- data.frame(lat = c(38.9575019287, 38.9507043428, 45),
     lon = c(-77.0892818598, -77.2, -90))
    
   #*** Can fail if radius=50 miles? ... Error in rbind() numbers of
   #  columns of arguments do not match !
   big = get.distances(t100, t1k, radius=100, units='miles', return.latlons=TRUE, as.df=TRUE) 
     head(big)
     summary(big$d)
   
   # see as map of many points
    plot(big$fromlon, big$fromlat,main='from black circles... 
      closest is red, others nearby are green ')
    points(t1k$lon, t1k$lat, col='blue',pch='.')
    points(big$tolon, big$tolat, col='green')
   junk=as.data.frame( get.nearest(t100, t1k, return.latlons=TRUE) )
   points(junk$tolon, junk$tolat, col='red')
   # Draw lines from frompoint to nearest:
   with(junk,linesegments(fromlon, fromlat, tolon, tolat) )
   
    # more test cases
 length(get.distances(t10,t10,radius=4999,ignore0 = TRUE, units='km')$d)
 get.distances(t10,t10,radius=4999,ignore0 = TRUE, units='km')
get.distances(test.from[1,],test.to[1,],radius=3000,return.rownums=FALSE,return.latlons=FALSE)
get.distances(test.from[1,],test.to[1,],radius=3000,return.rownums=FALSE,return.latlons=TRUE)
get.distances(test.from[1,],test.to[1,],radius=3000,return.rownums=TRUE,return.latlons=FALSE)
get.distances(test.from[1,],test.to[1,],radius=3000,return.rownums=TRUE,return.latlons=TRUE)
 
get.distances(test.from[1,],test.to[1:3,],radius=3000,return.rownums=FALSE,return.latlons=FALSE)
get.distances(test.from[1,],test.to[1:3,],radius=3000,return.rownums=FALSE,return.latlons=TRUE)
get.distances(test.from[1,],test.to[1:3,],radius=3000,return.rownums=TRUE,return.latlons=FALSE)
get.distances(test.from[1,],test.to[1:3,],radius=3000,return.rownums=TRUE,return.latlons=TRUE)
 
get.distances(test.from[1:2,],test.to[1,],radius=3000,return.rownums=FALSE,return.latlons=FALSE)
get.distances(test.from[1:2,],test.to[1,],radius=3000,return.rownums=FALSE,return.latlons=TRUE)
get.distances(test.from[1:2,],test.to[1,],radius=3000,return.rownums=TRUE,return.latlons=FALSE)
get.distances(test.from[1:2,],test.to[1,],radius=3000,return.rownums=TRUE,return.latlons=TRUE)
 
get.distances(test.from[1:2,],test.to[1:3,],radius=3000,return.rownums=FALSE,return.latlons=FALSE)
get.distances(test.from[1:2,],test.to[1:3,],radius=3000,return.rownums=FALSE,return.latlons=TRUE)
get.distances(test.from[1:2,],test.to[1:3,],radius=3000,return.rownums=TRUE,return.latlons=FALSE)
get.distances(test.from[1:2,],test.to[1:3,],radius=3000,return.rownums=TRUE,return.latlons=TRUE)
get.distances(test.from[1:2,],test.to[1:3,], radius=0.7,return.rownums=TRUE,
  return.latlons=TRUE, units='km')
get.distances(test.from[1:2,],test.to[1:3,], radius=0.7,return.rownums=TRUE,
  return.latlons=TRUE, units='miles')

  # Warning messages:
  # Ignoring return.crosstab because radius was specified
get.distances(test.from[1,],test.to[1:3, ], return.crosstab=TRUE)
get.distances(test.from[1:2,],test.to[1, ], return.crosstab=TRUE)
get.distances(test.from[1:2,],test.to[1:3, ], return.crosstab=TRUE)
get.distances(test.from[1:2,],test.to[1:3, ], radius=0.7, return.crosstab=TRUE)

ejanalysis/proxistat documentation built on April 2, 2024, 10:13 a.m.