spatialCorrForPoints: Calculate spatial autocorrelation for geographic points
In adamlilith/enmSdm: Tools for Modeling Niches and Distributions of Species

spatialCorrForPoints

R Documentation

Calculate spatial autocorrelation for geographic points

Description

This function calculates a measure of spatial autocorrelation for a set of geographic points. The procedure first computes the distribution of pairwise distances between the points. The frequency of distances are then tabulated in overlapping distance bins (e.g., 0 to 1000 m, 500 to 1500 m, 1000 to 2000 m, etc.). Then, many sets of randomly located points are generated. Each set contains the same number of points as observed. For each set pairwise distances are calculated and the distance distribution tabulated. The observed distance distribution can then be compared to the randomized distributions to determine at what distance(s) the observed distances are more/less clustered than expected by chance.

Usage

spatialCorrForPoints(
  pts,
  rast,
  breaks = 20,
  iters = 100,
  fixed = NULL,
  verbose = FALSE,
  ...
)

Arguments

`pts`	Either a SpatVector, SpatialPoints, or SpatialPointsDataFrame object, or a matrix or data frame with two columns where the first column has values for longitude and the second latitude.
`rast`	Either a SpatRster, raster, raster stack, or raster brick. Randomly located sites will be placed in any non-`NA` cell on this raster. If this is a raster stack or brick, the first layer will be used.
`breaks`	Three numeric values or a matrix or data frame with at least two columns: Single integer: The number of overlapping bins into which to enumerate values of `x`. The range of `x` covered by the bins bins will extend from 0 to the largest value plus 2.5 percent of the range of pairwise distances. Three numeric values: The first two values are smallest and largest distances (in units used in coordinate reference system of `rast`, typically meters) across which to tabulate distance frequencies. The third value is the number of bins. Matrix or data frame with at least two columns. Each row corresponds to a different bin. The first column represents the minimum distance in each bin (in units used in coordinate reference system of `rast`, typically meters) and the second column the maximum distance. Subsequent columns are ignored. Note that by using this option arbitrary bins can be used–they need not overlap or even be continuous in coverage.
`iters`	Positive integer, number of times to generate randomized realizations of points.
`fixed`	Either `NULL` (default) or a SpatialPoints or SpatialPointsDataFrame object, or a matrix or data frame with two columns where the first column has values for longitude and the second latitude. If not `NULL`, then the "observed" pairwise distance distribution is the set of pairwise distances between `pts` and `fixed`. Only the points in `pts` are randomly re-located.
`verbose`	Logical, if `TRUE` then display progress. Default is `FALSE`.
`...`	Arguments to pass to `randomPoints`.

Details

The idea behind this measure of spatial autocorrelation is that a set of geographic points is "independent" of one another if their pairwise distances are indistinguishable from pairwise distances of the same number of points randomly located across a landscape. Typically a set of points displays clustering (non-independence) across some distances but not all distances. Thus to identify the scale of clustering pairwise distances are tabulated into bins. (We suggest using overlapping bins, e.g., from 0 to 20000 m, 10000 to 30000 m, 20000 to 40000 m, etc. See the example below for how to do this). The function spatialCorrForPoints first calculates the observed distance distribution and tabulates the frequency of distances into bins. Then, it generates a set of randomly located points equal to the same number of points as in the observed set. It then calculates the randomized distance distribution and tabulates the distances. The randomization is repeated a large number of times (the default is 100). The observed frequency of distances can be compared to the set of random distances using spatialCorrForPointsSummary and spatialCorrForPointsPlot. The default values in those functions assume that clustering occurs if the observed pairwise distance is > the 95th quantile of the null frequency distribution for that bin (i.e., a 1-tailed test), but users can specify a different percentile to demarcate significance. In practice a series of distance bins often show clustering, but the one usually of interest is the first distance bin (the one closest to 0) that has a non-significant difference between observed and expected distances. This is the characteristic diameter of a cluster of points. Points closer than this distance can be considered non-independent of one another.
Alternatively, one can specify a set of points using the fixed argument. In this case, the "observed" pairwise distance distribution is tabulated from the set of pairwise distances between the points specified by argument pts and fixed. The randomized distance distribution is calculated by randomly re-locating points in pts and calculating distances to fixed. The function spatialCorrForPointsWeight calculates weights for a set of points based on the characteristic scale of spatial autocorrelation.

Value

A data frame. The first three columns represent the lower boundary, middle, and upper boundary for each distance bin. The column named observedProportion represents the proportion of observed pairwise distances in each bin. Subsequent columns are named randProportion (one per randomization iteration) which are the proportion of pairwise distances from randomized points that fall into each bin.

Examples

## Not run: 
# create raster of Madagascar
data(mad0)
rast <- raster::raster(mad0, res=c(1/12, 1/12))
rast[] <- 1
rast <- raster::crop(rast, mad0)
mad0rast <- raster::rasterize(mad0, rast)
rast <- rast * mad0rast

# lemur point data
data(lemurs)
fulvus <- lemurs[lemurs$species == 'Eulemur fulvus', c('longitude', 'latitude')]

# create overlapping bins for tabulating pairwise distances
ext <- extent(rast)
southwest <- c(ext@xmin, ext@ymin)
northeast <- c(ext@xmax, ext@ymax)
maxDist <- geosphere::distGeo(southwest, northeast)

binLength <- 60000 # in meters
maxDist <- binLength * ceiling(maxDist / binLength)

breaks <- data.frame(
	lower=seq(0, maxDist - binLength, by=binLength / 2),
	upper=seq(binLength, maxDist, by=binLength / 2)
)

# compare observed pairwise distance distribution to null distribution
# of pairwise values from randomly located points
obsAndNullDistrib <- spatialCorrForPoints(
	pts = fulvus,
	rast = rast,
	breaks = breaks,
	iters = 100,
	verbose = TRUE
)

# summary and plot
sacDist <- spatialCorrForPointsSummary(obsAndNullDistrib)
main <- paste('Characteristic cluster size:', sacDist, 'meters')
spatialCorrForPointsPlot(obsAndNullDistrib, xlab='Distance (m)', main=main)

# calculate weights
weight <- 4 * spatialCorrForPointsWeight(x=obsAndNullDistrib, pts=fulvus)
plot(mad0, main='Point size represents weight')
points(fulvus, pch=1, cex=weight)

## End(Not run)

adamlilith/enmSdm documentation built on Jan. 6, 2023, 11 a.m.