OccurrenceCollection: Download and vet GBIF occurrence data

View source: R/OccurrenceCollection.R

OccurrenceCollection    R Documentation

Download and vet GBIF occurrence data

Description

Takes a list of species and collects occurrence data from GBIF (Global Biodiversity Information Facility). Acts as a wrapper for rgbif::occ_search; however, this function is much more efficient for a large number of species. It also checks the taxonomy of the given species list against the GBIF taxonomy, renaming or merging taxa if necessary. Furthermore, this function vets the occurrence data, removing occurrence points that are of insufficient quality for species distribution modelling. Finally, OccurrenceCollection() provides the number of occurrences found within given training and study areas.

Usage

OccurrenceCollection(spplist, output, trainingarea = NA, studyarea = NA)

Arguments

spplist

A vector of scientific names, using GBIF taxonomy. Names can be species or subspecies.

output

Full path of the directory where the downloaded species occurrences will be written.

trainingarea

Extent object, or a vector giving the desired training extent in the form c(xmin, xmax, ymin, ymax), in latitude/longitude coordinates. If set to NA (the default), all occurrence points will be downloaded (up to 99,999 points), regardless of their location.

studyarea

(optional) Extent object, or a vector giving the area onto which the SDM will be projected, in the form c(xmin, xmax, ymin, ymax), in latitude/longitude coordinates. If provided, the number of occurrence points found within this study region will be calculated.
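A minimal sketch of a call using the arguments above, assuming the megaSDM package is installed and GBIF is reachable over the network. The species names, extents, and output directory are illustrative placeholders chosen for this sketch, not values taken from the package. The call is guarded so that nothing is downloaded in a non-interactive session.

```r
# Illustrative inputs (hypothetical values chosen for this sketch)
spplist <- c("Puma concolor", "Lynx rufus")
trainingarea <- c(-125, -66, 24, 50)  # c(xmin, xmax, ymin, ymax)
studyarea <- c(-105, -90, 30, 45)     # c(xmin, xmax, ymin, ymax)

# Directory that will receive the occurrence .csv files
output <- file.path(tempdir(), "occurrences")
dir.create(output, showWarnings = FALSE, recursive = TRUE)

# Only contact GBIF when megaSDM is installed and the session is interactive
if (requireNamespace("megaSDM", quietly = TRUE) && interactive()) {
  occ_summary <- megaSDM::OccurrenceCollection(
    spplist = spplist,
    output = output,
    trainingarea = trainingarea,
    studyarea = studyarea
  )
  print(occ_summary)
}
```

Leaving trainingarea and studyarea at their NA defaults would skip the spatial restriction and the study-area count, respectively.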

Value

Writes .csv files of GBIF occurrences to the directory given by output. If any species failed (for example, the scientific name was not found in the search, or no occurrences exist within the provided trainingarea), an additional .csv file listing the names of the failed species is written to the same folder. The function also returns a data frame containing the higher taxonomy of each species and the number of occurrences found within the trainingarea and, optionally, the studyarea.
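The per-area occurrence counts can be illustrated with a small base R sketch. This is not the package's implementation: count_in_extent is a hypothetical helper, and the column names decimalLongitude and decimalLatitude are assumed here to follow the usual GBIF naming.

```r
# Hypothetical helper: count points inside an extent given as
# c(xmin, xmax, ymin, ymax). Points with missing coordinates are ignored.
count_in_extent <- function(lon, lat, ext) {
  stopifnot(length(ext) == 4, ext[1] <= ext[2], ext[3] <= ext[4])
  inside <- lon >= ext[1] & lon <= ext[2] & lat >= ext[3] & lat <= ext[4]
  sum(inside, na.rm = TRUE)
}

# Made-up occurrence points for illustration only
occ <- data.frame(
  decimalLongitude = c(-100.2, -80.5, -110.0, 10.0),
  decimalLatitude  = c(35.1, 40.3, 45.0, 50.0)
)

trainingarea <- c(-125, -66, 24, 50)
studyarea <- c(-105, -90, 30, 45)

n_training <- count_in_extent(occ$decimalLongitude, occ$decimalLatitude, trainingarea)
n_study <- count_in_extent(occ$decimalLongitude, occ$decimalLatitude, studyarea)
```

With these made-up points, three fall inside the training extent and one inside the study extent.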

List of occurrences removed by OccurrenceCollection:

  1. Fossil specimens (mitigates the effect of long-term climatic changes on the SDM)

  2. "cdiv": Coordinate Invalid

  3. "cdout": Coordinate Out of Range

  4. "cdrepf": Coordinate Reprojection Failed

  5. "cdreps": Coordinate Reprojection Suspicious

  6. "gdativ": Geodetic Datum Invalid

  7. "preneglat": Presumed Negated Latitude

  8. "preneglon": Presumed Negated Longitude

  9. "preswcd": Presumed Swapped Coordinates

  10. "txmatnon": No Taxon Match

  11. "zeocd": Exact 0/0 Coordinate

Further vetting may be done by hand.
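The removal rules listed above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by OccurrenceCollection: it assumes occurrence records carry an issues column of comma-separated GBIF issue codes and a basisOfRecord column, as in typical rgbif output, and has_removed_issue is a hypothetical helper.

```r
# Issue codes that lead to removal, taken from the list above
removed_issues <- c("cdiv", "cdout", "cdrepf", "cdreps", "gdativ",
                    "preneglat", "preneglon", "preswcd", "txmatnon", "zeocd")

# Hypothetical helper: TRUE for each record whose comma-separated
# issue string contains at least one of the given codes
has_removed_issue <- function(issues, codes = removed_issues) {
  vapply(
    strsplit(issues, ",", fixed = TRUE),
    function(x) any(trimws(x) %in% codes),
    logical(1)
  )
}

# Made-up records for illustration only
occ <- data.frame(
  key = 1:4,
  issues = c("", "cdround,gass84", "zeocd", "gass84,preswcd"),
  basisOfRecord = c("HUMAN_OBSERVATION", "FOSSIL_SPECIMEN",
                    "PRESERVED_SPECIMEN", "HUMAN_OBSERVATION"),
  stringsAsFactors = FALSE
)

# Drop fossil specimens and any record flagged with a listed issue
keep <- !has_removed_issue(occ$issues) & occ$basisOfRecord != "FOSSIL_SPECIMEN"
vetted <- occ[keep, ]
```

Here only the first record survives: the second is a fossil specimen, and the third and fourth carry "zeocd" and "preswcd". The same pattern could be extended with additional codes when vetting by hand.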


brshipley/megaSDM documentation built on Nov. 26, 2024, 6:08 a.m.