geoThin: Thin geographic points (mostly) deterministically

View source: R/geoThin.r

geoThin    R Documentation

Thin geographic points (mostly) deterministically


Description

This function thins geographic points such that none have nearest neighbors closer than some user-specified distance. The results are almost deterministic (see Details).


Usage

geoThin(x, minDist, longLat = NULL, distFunct = NULL, verbose = FALSE, ...)



Arguments

x: Data frame, matrix, SpatialPoints, or SpatialPointsDataFrame object.


minDist: Numeric. Minimum distance needed between points to retain them. Points with a neighbor closer than this distance are subject to removal. If distFunct is distGeo then this should be in the same units as a, the ellipsoid radius argument of distGeo (meters by default; see distGeo in the geosphere package and related "dist" functions).


longLat: Two-element character vector or two-element integer vector. If x is a data frame then this should be a character vector specifying the names of the fields in x, or a two-element integer vector giving the column indices, that correspond to longitude and latitude (in that order). For example, c('long', 'lat') or c(1, 2). If x is a matrix then this is a two-element vector indicating the column numbers in x that represent longitude and latitude (for example, c(1, 2)). If x is a SpatialPoints or a SpatialPointsDataFrame object then this argument is ignored.


distFunct: Either a function or NULL. If NULL then distGeo is used to calculate distances. Other distance functions from the geosphere package can be used instead (see distHaversine and references therein). Alternatively, a custom function can be used so long as its first argument is a two-column numeric matrix with one row holding the x- and y-coordinates of a single point and its second argument is a two-column numeric matrix with one or more rows of other points.


verbose: Logical. If TRUE then display progress.


...: Arguments to pass to distFunct.
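A custom distance function only has to accept a one-row, two-column matrix for the focal point and a two-column matrix of other points. The sketch below is one minimal possibility, assuming planar (projected) coordinates; the name euclidDist is hypothetical and is not part of the package.

```r
# Hypothetical custom distance function (not part of enmSdm).
# p1: one point as a 1 x 2 numeric matrix (x, y).
# p2: one or more other points as an n x 2 numeric matrix.
# Returns a numeric vector of n planar (Euclidean) distances.
euclidDist <- function(p1, p2, ...) {
  p1 <- matrix(as.numeric(p1), ncol = 2)
  p2 <- matrix(as.numeric(p2), ncol = 2)
  sqrt((p2[, 1] - p1[1, 1])^2 + (p2[, 2] - p1[1, 2])^2)
}

# distances from the origin to two other points
euclidDist(cbind(0, 0), rbind(c(3, 4), c(0, 0)))
```

Such a function could then be supplied as distFunct = euclidDist, in which case minDist would need to be given in the same planar units as the coordinates.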


Details

The procedure for removing points is as follows:

  • Find the points with the largest number of neighbors (points < minDist away). If just one such point exists, remove it, but if there is more than one then...

  • Of these find the points with the closest neighbor within minDist. If just one such point exists, remove it, but if there is more than one then...

  • Of these find the point that is closest to the centroid of all non-removed points. If just one such point exists, remove it, but if there is more than one then...

  • Of these find the point that has the smallest median distance to all points (even if > minDist). If just one such point exists, remove it, but if there is more than one then...

  • Of these randomly select a point and remove it.

  • Repeat.

Thus the results are deterministic up to the last tie-breaking step.
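
The steps above can be sketched in base R as follows. This is an illustration of the tie-breaking order only, not the package's implementation: the names haversine and thinSketch are hypothetical, distances use a spherical haversine formula rather than distGeo, and the centroid is taken as the simple mean of the longitudes and latitudes.

```r
# Hypothetical helper: great-circle distance in meters on a sphere of radius r.
# p is one point (long, lat); q is a two-column matrix of points.
haversine <- function(p, q, r = 6371000) {
  q <- matrix(as.numeric(q), ncol = 2)
  toRad <- pi / 180
  dLon <- (q[, 1] - p[1]) * toRad
  dLat <- (q[, 2] - p[2]) * toRad
  a <- sin(dLat / 2)^2 +
    cos(p[2] * toRad) * cos(q[, 2] * toRad) * sin(dLon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical sketch of the thinning loop. Returns the row indices retained.
thinSketch <- function(xy, minDist) {
  xy <- unname(as.matrix(xy))
  keep <- seq_len(nrow(xy))
  repeat {
    pts <- xy[keep, , drop = FALSE]
    n <- nrow(pts)
    if (n < 2) break
    # pairwise distances; row i holds distances from point i to all points
    d <- t(sapply(seq_len(n), function(i) haversine(pts[i, ], pts)))
    diag(d) <- NA
    nNeigh <- rowSums(d < minDist, na.rm = TRUE)
    if (max(nNeigh) == 0) break  # no point has a neighbor within minDist
    # step 1: points with the most neighbors
    cand <- which(nNeigh == max(nNeigh))
    # step 2: of these, points with the closest neighbor
    if (length(cand) > 1) {
      nearest <- apply(d[cand, , drop = FALSE], 1, min, na.rm = TRUE)
      cand <- cand[nearest == min(nearest)]
    }
    # step 3: of these, points closest to the centroid of retained points
    if (length(cand) > 1) {
      dCent <- haversine(colMeans(pts), pts[cand, , drop = FALSE])
      cand <- cand[dCent == min(dCent)]
    }
    # step 4: of these, points with the smallest median distance to all points
    if (length(cand) > 1) {
      med <- apply(d[cand, , drop = FALSE], 1, median, na.rm = TRUE)
      cand <- cand[med == min(med)]
    }
    # step 5: any remaining tie is broken at random
    if (length(cand) > 1) cand <- cand[sample.int(length(cand), 1)]
    keep <- keep[-cand]
  }
  keep
}

xy <- cbind(long = c(-90.1, -90.1, -90.15, -90.17, -90.2, -89), lat = 38)
thinSketch(xy, minDist = 500)
```

With these coordinates the first two points are exact duplicates, so every deterministic criterion ties and the sketch reaches the random step; repeated calls may therefore keep either row 1 or row 2.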


Value

Object of the same class as x.

See Also



Examples

# example using data frame
x <- data.frame(long=c(-90.1, -90.1, -90.15, -90.17, -90.2, -89),
   lat=c(38, 38, 38, 38, 38, 38), point=letters[1:6])
geoThin(x, minDist=500, longLat=1:2, verbose=TRUE)
geoThin(x, minDist=5000, longLat=c(1, 2), verbose=TRUE)

# example of potential randomness
geoThin(x, minDist=1000, longLat=c(1, 2))
geoThin(x, minDist=1000, longLat=c(1, 2))
geoThin(x, minDist=1000, longLat=c(1, 2))

# example using SpatialPointsDataFrame
fulvus <- lemurs[lemurs$species == 'Eulemur fulvus', c('longitude', 'latitude')]
fulvus <- sp::SpatialPointsDataFrame(
		coords=fulvus, data=fulvus, # assumed: these two arguments were lost from the listing
		proj4string=getCRS('wgs84', TRUE)
)

sp::plot(mad0, main='Madagascar')
points(fulvus, col='red')
thinned <- geoThin(fulvus, 50000)
points(thinned, pch=16)
legend('topright', legend=c('retained', 'discarded'),
col=c('black', 'red'), pch=c(16, 1))

# test to see that the function works when no points need to be removed
thinned <- geoThin(fulvus, 200, verbose=TRUE)
sp::plot(mad0, main='Madagascar')
points(fulvus, col='red')
points(thinned, pch=16)
legend('topright', legend=c('retained', 'discarded'),
col=c('black', 'red'), pch=c(16, 1))

adamlilith/enmSdm documentation built on Jan. 6, 2023, 11 a.m.