formatLoc: Format Locality Information


formatLoc R Documentation

Format Locality Information

Description

This function standardizes the names of administrative levels in species occurrence records obtained from online databases, such as GBIF or speciesLink.

Usage

formatLoc(
  x,
  select.cols = c("loc", "loc.correct", "latitude.gazetteer", "longitude.gazetteer",
    "resolution.gazetteer"),
  loc.levels = c("country", "stateProvince", "municipality", "locality"),
  scrap = TRUE,
  adm.names = c("country.new", "stateProvince.new", "municipality.new"),
  loc.names = c("locality.new", "locality.scrap", "resol.orig"),
  str.names = c("resol.orig", "loc.string", "loc.string1", "loc.string2"),
  gazet = "plantR",
  gazet.names = c("loc", "loc.correct", "latitude.gazetteer", "longitude.gazetteer",
    "resolution.gazetteer"),
  orig.names = FALSE
)

Arguments

x

a data frame containing the typical fields of occurrence records from herbarium specimens.

select.cols

a vector with the names of the columns that should be added to the input data.frame. By default, only the additional columns retrieved from the gazetteer are returned, and the locality strings used in the search are discarded.

loc.levels

a vector containing the names of the locality fields to be formatted.

scrap

logical. Should the search for missing locality information be performed? Defaults to TRUE.

adm.names

a vector of column names containing the country, state/province and municipality information, in that order. Defaults to 'country.new', 'stateProvince.new' and 'municipality.new'.

loc.names

a vector of column names containing the locality (original and alternative) and the resolution of the locality information. Defaults to 'locality.new', 'locality.scrap' and 'resol.orig'.

str.names

a vector of at least two column names containing the locality resolution and the search string(s), in that order. Defaults to 'resol.orig', 'loc.string', 'loc.string1' and 'loc.string2'.

gazet

a data.frame containing the gazetteer, or the string "plantR" (the default) to use the internal plantR gazetteer, which is biased towards Latin America.

gazet.names

a vector of at least four column names containing the locality search string, the corrected locality string, the latitude and the longitude, in that order. If available, the resolution of the gazetteer can be provided as a fifth name. Defaults to the column names of the plantR gazetteer: 'loc', 'loc.correct', 'latitude.gazetteer', 'longitude.gazetteer' and 'resolution.gazetteer'.
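
As a rough illustration of the constraint on gazet.names, the sketch below builds a small custom gazetteer and the matching vector of names using base R only. The object names (my_gazet, my_names), the column names and all values are hypothetical placeholders. The role of the second column as the corrected locality string is an assumption based on the default name 'loc.correct'.

```r
# Hypothetical custom gazetteer: column names and values are placeholders,
# not taken from plantR.
my_gazet <- data.frame(
  search = c("placeholder string a", "placeholder string b"),
  fixed  = c("placeholder string a", "placeholder string b"),
  lat    = c(-10.5, 20.25),   # made-up coordinates
  lon    = c(-40.0, -80.75),  # made-up coordinates
  stringsAsFactors = FALSE
)

# Names in the order gazet.names expects: search string, corrected string
# (assumed), latitude, longitude. A fifth name (resolution) is optional.
my_names <- c("search", "fixed", "lat", "lon")

# Basic sanity checks before handing both objects to formatLoc()
stopifnot(length(my_names) >= 4)
stopifnot(all(my_names %in% names(my_gazet)))
```

Both objects would then be passed as gazet = my_gazet and gazet.names = my_names.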

orig.names

logical. Should the original column names of the gazetteer be preserved? Defaults to FALSE.

Details

The function works as a wrapper: the individual steps of the proposed plantR workflow for editing locality information are performed together in a single call (see the plantR tutorial for details).

The input data frame usually contains the following locality fields: "country", "stateProvince", "municipality" and "locality".
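
A minimal sketch of a call with the default arguments is given below, assuming the plantR package is installed. The data frame occs and its values are made up for illustration; only the four column names come from the text above. The call is guarded so that the snippet still runs without plantR.

```r
# Toy occurrence table with the four usual locality fields.
# All values are invented for illustration.
occs <- data.frame(
  country       = c("BR", "Brazil", "Brasil", "USA"),
  stateProvince = c("RJ", "Rio de Janeiro", "Rio de Janeiro", "Florida"),
  municipality  = c("Nova Friburgo", "Nova Friburgo", "Nova Friburgo", "Miami"),
  locality      = c("Macae de Cima", "Macae de Cima", "Macae de Cima", "Beach"),
  stringsAsFactors = FALSE
)

# Run the wrapper only if plantR is available; otherwise keep the input as is.
if (requireNamespace("plantR", quietly = TRUE)) {
  occs_fmt <- plantR::formatLoc(occs)
  # Show which columns the call added to the input
  print(setdiff(names(occs_fmt), names(occs)))
} else {
  occs_fmt <- occs
}
```

Which columns are added depends on select.cols; the defaults are listed under Usage.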

Value

The input data frame x, plus the new columns with the formatted information.

See Also

fixLoc, strLoc, prepLoc, and getLoc.


LimaRAF/plantR documentation built on Jan. 1, 2023, 10:18 a.m.