occThin: Occurrence thinning

occThin    R Documentation

Description

Occurrence thinning

Usage

occThin(
  occ = NA,
  xCol = NULL,
  yCol = NULL,
  thinDist = 0,
  isLatLong = TRUE,
  quiet = TRUE
)

Arguments

occ

Data.frame. The occurrence data to be thinned, with at least two columns storing longitude (X) and latitude (Y) coordinates.

xCol

Integer or Character. The column index or name of the longitude or X coordinate.

yCol

Integer or Character. The column index or name of the latitude or Y coordinate.

thinDist

Numeric. The minimum distance, in kilometres, allowed between retained points. Default is 0.

isLatLong

Logical. Are the coordinates latitude and longitude? Default is TRUE. If set to FALSE, the points are assumed to be on a projection with X and Y coordinates in metres relative to an origin.

quiet

Logical. Should progress messages be suppressed? Default is TRUE.
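
A call might look like the following minimal sketch. The data frame occ_demo, its column names and its coordinates are invented for illustration. The call is guarded so that it only runs when fitMaxnet is installed, and it assumes that occThin is exported from that package.

```r
# Hypothetical occurrence records: names and coordinates are invented for illustration
occ_demo <- data.frame(
  species   = "Demo species",
  longitude = c(150.00, 150.01, 151.00, 152.50),
  latitude  = c(-33.00, -33.01, -34.00, -35.50),
  stringsAsFactors = FALSE
)

# Run only if the package is available (assumes occThin is exported by fitMaxnet)
if (requireNamespace("fitMaxnet", quietly = TRUE)) {
  thinned <- fitMaxnet::occThin(
    occ       = occ_demo,
    xCol      = "longitude",  # column name; an integer index should also be accepted
    yCol      = "latitude",
    thinDist  = 5,            # kilometres
    isLatLong = TRUE,
    quiet     = TRUE
  )
  print(thinned)
}
```

Passing column names rather than indices keeps the call readable when the input files do not all share the same column order.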

Details

This function is based on the source code of ecospat.occ.desaggregation(), written by Olivier Broennimann and included in the R package ecospat. I think that the original algorithm is extremely clever and very efficient: it outperforms the thin() function in package spThin by at least an order of magnitude, and is at least 5 times faster than my "fast" interpretation of the thin() algorithm. However, the original code had a number of quirks that made it tricky to use in a "production environment", i.e. for bulk processing of hundreds or even thousands of species occurrence files. The following changes and improvements were made:

Plotting

The original source code could generate plots. I decided to leave that out, so users are free to make their own plots.

Input data.frame

The way the original code handled its input meant that the object returned was not the full input data.frame with 'bad' rows removed. This function does return the original data.frame, with all of its columns, with the bad rows removed.

Parameters

The suite of parameters (or arguments) was simplified, and the parameters were given more meaningful names.

X & Y columns

The original code used a rather odd and convoluted way of identifying and using the columns representing the x- and y-coordinates. Here the two columns are given directly, by index or by name, through xCol and yCol, and the function returns the original but thinned data.frame.
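
To make concrete what distance-based thinning does, here is a naive greedy sketch in base R. It is not the ecospat algorithm and not the code used by occThin; it only illustrates the idea of dropping records that lie closer than a minimum distance to a record already kept. The function names haversine_km and greedy_thin, the sample data and the Earth radius of 6371 km are all assumptions made for this example.

```r
# Great-circle (haversine) distance in km from one point to a vector of points.
# The mean Earth radius of 6371 km is an assumption of this sketch.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Naive greedy thinning (hypothetical helper): walk through the rows in order
# and keep a row only if it is at least thin_km from every row already kept.
greedy_thin <- function(df, x_col, y_col, thin_km) {
  keep <- integer(0)
  for (i in seq_len(nrow(df))) {
    d <- haversine_km(df[[x_col]][i], df[[y_col]][i],
                      df[[x_col]][keep], df[[y_col]][keep])
    # all() on an empty vector is TRUE, so the first row is always kept
    if (all(d >= thin_km)) keep <- c(keep, i)
  }
  # Return the full rows so every input column is preserved
  df[keep, , drop = FALSE]
}

# Invented records along latitude -33; rows 2 and 4 lie within 5 km of row 1
pts <- data.frame(
  id  = 1:4,
  lon = c(150.00, 150.01, 151.00, 150.02),
  lat = c(-33.00, -33.00, -33.00, -33.00)
)
thinned_pts <- greedy_thin(pts, "lon", "lat", thin_km = 5)
print(thinned_pts)
```

In this sketch the result depends on row order, and each record is compared against every record kept so far, which is the kind of cost that a more efficient algorithm avoids.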

Value

A data.frame with exactly the same column structure as the one passed in parameter occ, but with the rows removed for occurrence records that lie less than thinDist from their nearest neighbours.
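
The two properties of the return value can be checked after the fact. The sketch below does this for projected coordinates in metres (the isLatLong = FALSE case). It assumes that distances on the projection are plain Euclidean distances and that thinDist in kilometres is compared after conversion to metres. The helper name check_thinned and the sample data are hypothetical.

```r
# Check the two properties described for the return value (hypothetical helper):
# same column structure as the input, and no two rows closer than thin_km.
check_thinned <- function(original, thinned, x_col, y_col, thin_km) {
  same_columns <- identical(names(original), names(thinned))
  if (nrow(thinned) < 2) {
    min_dist_ok <- TRUE  # fewer than two rows: nothing can be too close
  } else {
    # Coordinates are assumed to be in metres; dist() gives Euclidean distances
    d_m <- dist(as.matrix(thinned[, c(x_col, y_col)]))
    min_dist_ok <- min(d_m) >= thin_km * 1000
  }
  c(same_columns = same_columns, min_dist_ok = min_dist_ok)
}

# Invented sites: "b" is 5 km from "a", and "c" is 20 km from "a"
original <- data.frame(
  site = c("a", "b", "c"),
  x    = c(0, 3000, 20000),
  y    = c(0, 4000, 0)
)
candidate <- original[c(1, 3), ]  # a thinned result with "b" removed

res_thinned   <- check_thinned(original, candidate, "x", "y", thin_km = 10)
res_unthinned <- check_thinned(original, original, "x", "y", thin_km = 10)
print(res_thinned)
print(res_unthinned)
```

A check of this kind is cheap to run over each output when bulk processing many occurrence files.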


peterbat1/fitMaxnet documentation built on Sept. 17, 2024, 10:50 p.m.