mahalanobisDist: Calculate Squared Mahalanobis Distances

View source: R/accessory_geo.R

mahalanobisDist    R Documentation

Description

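Calculates the squared Mahalanobis distances of geographical coordinates to the center of their distribution, using a classic and/or a robust estimate of the covariance matrix.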

Usage

mahalanobisDist(
  lon,
  lat,
  method = NULL,
  n.min = 5,
  digs = 4,
  center = "mean",
  geo = NULL,
  cult = NULL,
  geo.patt = "ok_",
  cult.patt = NA
)

Arguments

lon

numerical. Longitude in decimal degrees.

lat

numerical. Latitude in decimal degrees.

method

character. Type of method desired: 'classic' and/or 'robust' (see Details)

n.min

numerical. Minimum number of unique coordinates to be used in the calculations.

digs

numerical. Number of digits to be returned after the decimal point. Defaults to 4.

center

character. Which metric should be used to obtain the center of the distribution of coordinates: 'mean' or 'median'?

geo

character. A vector of the same length as lon/lat containing the result of the validation of the geographical coordinates. Defaults to NULL.

cult

character. A vector of the same length as lon/lat containing the result of the validation of cultivated specimens. Defaults to NULL.

geo.patt

character. The pattern used to search for the classes of geographical validation to be included in the analyses. Defaults to "ok_".

cult.patt

character. The pattern used to search for the classes of validation of cultivated specimens to be included in the analyses. Defaults to NA.

Details

Two methods are available to calculate the Mahalanobis distances: the classic (method = 'classic') and the robust (method = 'robust'). Both take into account the geographical center of the distribution of coordinates and the spatial covariance between them, but they differ in how the covariance matrix of the distribution is estimated: the classic method uses an approach based on Pearson's method, while the robust method uses a Minimum Covariance Determinant (MCD) estimator.
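The difference between the two approaches can be sketched as follows. This is only an illustration and not the code used by mahalanobisDist(): it assumes that stats::mahalanobis() (which returns squared distances) and MASS::cov.mcd() are reasonable stand-ins for the classic and MCD-based computations, and the object names (xy, d_classic, d_robust) are hypothetical.

```r
# Illustrative sketch only; not the plantR implementation.
library(MASS)  # assumption: cov.mcd() stands in for the MCD estimator

lon <- c(-42.2, -42.6, -45.3, -42.5, -42.3, -39.0, -12.2)
lat <- c(-44.6, -46.2, -45.4, -42.2, -43.7, -45.0, -8.0)
xy <- cbind(lon = lon, lat = lat)

# Classic: center and covariance estimated from all points
d_classic <- mahalanobis(xy, center = colMeans(xy), cov = cov(xy))

# Robust: center and covariance taken from the MCD estimator,
# which is much less influenced by the distant last point
mcd <- cov.mcd(xy)
d_robust <- mahalanobis(xy, center = mcd$center, cov = mcd$cov)

# Squared distances side by side, one row per coordinate
round(cbind(classic = d_classic, robust = d_robust), 4)
```

Because the classic estimate is pulled towards the distant last point, that point's classic distance is smaller than its robust distance, which is why the robust method tends to separate outliers more sharply.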

The argument n.min controls the minimum number of unique coordinates necessary to calculate the distances. The classic and robust methods need at least 3 and 4 spatially unique coordinates, respectively, to obtain the distances. However, the MCD algorithm of the robust method can run into singularity issues depending on how close the coordinates are to each other. This can result in the overestimation of the distances and thus in poor outlier flagging. A minimum of five, and ideally 10, unique coordinates should avoid those problems.

If the MCD algorithm runs into singularity issues, the function silently adds some random noise to both coordinates and re-runs the MCD algorithm. This aims to deal with cases of few coordinates that are close to each other and, in practice, it should not change the overall result of the detection of spatial outliers.
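The general idea of this fallback can be sketched as below. The way the function detects the problem and the amount of noise it adds are not described here, so both the is_singular() helper (a determinant check with an arbitrary tolerance) and the noise standard deviation of 0.01 degrees are assumptions made for illustration only.

```r
# Illustrative sketch only; the detection rule and noise size are assumptions.
set.seed(42)

# Five points lying exactly on a line: their covariance matrix is singular
lon <- c(-42.1, -42.2, -42.3, -42.4, -42.5)
lat <- lon - 2
xy <- cbind(lon = lon, lat = lat)

# Hypothetical helper: TRUE if the covariance matrix is (numerically) singular
is_singular <- function(m, tol = 1e-10) abs(det(cov(m))) < tol

if (is_singular(xy)) {
  # Add a small amount of Gaussian noise to both coordinates and retry
  xy_noisy <- xy + matrix(rnorm(length(xy), sd = 0.01), ncol = 2)
} else {
  xy_noisy <- xy
}

# Squared distances can now be computed because the matrix is invertible
d <- mahalanobis(xy_noisy, colMeans(xy_noisy), cov(xy_noisy))
```

The noise is small relative to the spread of the coordinates, so it makes the covariance matrix invertible without meaningfully moving the points.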

The presence of problematic coordinates and cultivated specimens can greatly influence the estimation of the geographical center of the distribution of coordinates and of the spatial covariance between them. Thus, the arguments geo and cult can be used to flag those cases and remove them from the computation of the center and covariance matrix. In both cases, the user can provide either the output from the plantR functions checkCoord() and getCult() or a logical TRUE/FALSE vector. By default, if the inputs are the outputs from checkCoord() and getCult(), only the coordinates flagged as 'ok_...' in geo and those not flagged in cult (i.e. NAs) will be used. Users can select different search patterns using the arguments geo.patt and cult.patt. For both input options, the vector must have the same length as the coordinates provided in the arguments lat and lon. By default, the arguments geo and cult are set to NULL, meaning that all coordinates will be used.
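The default selection rule described above can be pictured with the short sketch below. The class labels in geo and cult are made up for the example and may not match the actual output of checkCoord() and getCult(); the object names (keep_geo, keep_cult, keep) are hypothetical.

```r
# Illustrative sketch only; class labels below are made up for the example.
geo  <- c("ok_county", "ok_state", "ok_county", "sea",
          "ok_country", "bad_country", "ok_state")
cult <- c(NA, NA, "cultivated", NA, NA, NA, NA)

geo.patt  <- "ok_"  # default: keep classes matching 'ok_'
cult.patt <- NA     # default: keep records not flagged as cultivated

keep_geo <- grepl(geo.patt, geo)
keep_cult <- if (is.na(cult.patt)) is.na(cult) else grepl(cult.patt, cult)

# Records used to estimate the center and the covariance matrix
keep <- keep_geo & keep_cult
keep

# Alternative input: a logical vector where TRUE marks a valid record
geo_lgl <- c(rep(TRUE, 6), FALSE)
```

In this sketch a record is retained only if it passes both filters, so a well-georeferenced but cultivated specimen is still left out.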

The function internally removes spatially duplicated coordinates prior to the calculation of the Mahalanobis distances. So, the value in n.min corresponds to the number of coordinates after the removal of spatially duplicated coordinates.

The function also internally removes any empty or NA values in lon or lat.
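To see how these internal clean-up steps interact with n.min, the sketch below counts the coordinates that would remain. It treats only exact duplicates as spatially duplicated, which is an assumption (the function may define duplicates differently), and the object names are hypothetical.

```r
# Illustrative sketch only; exact-match duplicates are an assumption.
lon <- c(-42.2, -42.2, -42.6, NA, -45.3, -42.5, -42.6)
lat <- c(-44.6, -44.6, -46.2, -45.0, -45.4, NA, -46.2)
n.min <- 5

xy <- cbind(lon = lon, lat = lat)

# Drop records with a missing longitude or latitude
xy <- xy[complete.cases(xy), , drop = FALSE]

# Drop repeated lon/lat pairs, keeping the first occurrence
xy_unique <- xy[!duplicated(xy), , drop = FALSE]

# n.min is compared against what is left, not against the input length
n_unique <- nrow(xy_unique)
enough <- n_unique >= n.min
```

Here seven input records shrink to three unique coordinates, so the default n.min = 5 would not be met even though the input vectors have more than five elements.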

Value

The input data frame with one or more new columns containing the distances obtained using the selected method(s).

Author(s)

Renato A. Ferreira de Lima

See Also

uniqueCoord, checkCoord, getCult

Examples

lon <- c(-42.2,-42.6,-45.3,-42.5,-42.3,-39.0,-12.2)
lat <- c(-44.6,-46.2,-45.4,-42.2,-43.7,-45.0,-8.0)

## Not run: 
# Assuming that all coordinates are valid
mahalanobisDist(lon, lat, method = "classic", n.min = 1)
mahalanobisDist(lon, lat, method = "robust", n.min = 1)

# Flagging last coordinate as problematic
mahalanobisDist(lon, lat, method = "classic", n.min = 1,
                geo = c(rep(TRUE, 6), FALSE))
mahalanobisDist(lon, lat, method = "robust", n.min = 1,
                geo = c(rep(TRUE, 6), FALSE))

## End(Not run)


LimaRAF/plantR documentation built on Jan. 1, 2023, 10:18 a.m.