TNRS_robust: Resolve plant taxonomic names and re-do erroneous results

View source: R/TNRS_robust.R

TNRS_robust    R Documentation

Resolve plant taxonomic names and re-do erroneous results

Description

Resolve plant taxonomic names and automatically re-try any names that return erroneous results.

Usage

TNRS_robust(
  taxonomic_names,
  sources = c("wcvp", "wfo"),
  classification = "wfo",
  mode = "resolve",
  matches = "best",
  accuracy = NULL,
  skip_internet_check = FALSE,
  name_limit = 5000,
  attempts = 10,
  ...
)

Arguments

taxonomic_names

Data.frame containing two columns: 1) Row number, 2) Taxonomic names to be resolved (or parsed). Note that these two columns must be in this order. Alternatively, a character vector of names can be supplied.

sources

Character. Taxonomic sources to use. Default is c("wcvp", "wfo"). Options include "wfo", "wcvp", and "cact". Use TNRS_sources() for more information.

classification

Character. Family classification to use. Current options include "wfo" (the default).

mode

Character. Options are "resolve" and "parse". The default is "resolve".

matches

Character. Should all matches be returned ("all") or only the best match ("best", the default)?

accuracy

Numeric. If specified, only matches with a score greater than or equal to the supplied accuracy level will be returned. If left NULL, the default threshold will be used.

skip_internet_check

Logical. Should the check for internet connectivity be skipped? Default is FALSE.

name_limit

Numeric. The maximum number of names to check in one batch. The default is 5000 and is usually the fastest option. This cannot exceed 5000.

attempts

Numeric. The maximum number of times to re-try any erroneous results. The default is 10, although a single re-try is usually sufficient.

...

Additional parameters passed to internal functions.
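
The re-try behaviour controlled by attempts can be pictured with a small, self-contained sketch. This is an illustration of the general idea only, not the package's internal code: retry_failed() and make_flaky_resolver() are hypothetical helpers, and an erroneous result is assumed here to show up as NA.

```r
# Hypothetical stand-in for a name resolver; NOT part of the TNRS package.
# The first call pretends that the second name came back erroneous (NA);
# every later call succeeds.
make_flaky_resolver <- function() {
  calls <- 0
  function(names) {
    calls <<- calls + 1
    out <- paste0(names, " [resolved]")
    if (calls == 1 && length(out) >= 2) out[2] <- NA_character_
    out
  }
}

# Re-submit only the failed entries, at most `attempts` times.
retry_failed <- function(names, resolve_fun, attempts = 10) {
  results <- resolve_fun(names)
  for (i in seq_len(attempts)) {
    failed <- which(is.na(results))
    if (length(failed) == 0) break        # nothing left to re-do
    results[failed] <- resolve_fun(names[failed])
  }
  results
}

resolver <- make_flaky_resolver()
resolved <- retry_failed(
  c("Acer rubrum", "Quercus alba", "Pinus strobus"),
  resolver,
  attempts = 3
)
resolved
```

Only the entries that failed are sent again, so a single re-try is enough in this toy case and the remaining attempts go unused.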

Value

Data.frame containing the TNRS results.

Note

wfo = World Flora Online, wcvp = World Checklist of Vascular Plants, cact = Cactaceae at Caryophyllales.org.

For queries of more than 5000 names, the function will automatically divide the query into batches of 5000 names and then run the batches one after the other. Thus, for very large queries this may take some time. When this is the case, a progress bar will be displayed.

IMPORTANT: Queries are parallelized automatically by the API, so there is no need to parallelize further in R. Doing so may actually slow things down.
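
To give a feel for the batching described above, the following sketch splits a long vector of names into consecutive batches of at most 5000. It is a rough illustration in base R, not the function's actual implementation; split_into_batches() is a hypothetical helper and the names are placeholders.

```r
# Hypothetical helper: split names into consecutive batches of at most
# `name_limit` names (the documented maximum is 5000).
split_into_batches <- function(names, name_limit = 5000) {
  stopifnot(name_limit >= 1, name_limit <= 5000)
  batch_id <- ceiling(seq_along(names) / name_limit)
  split(names, batch_id)
}

# Placeholder names, used only to show the batch sizes.
example_names <- paste("Species", seq_len(12001))
batches <- split_into_batches(example_names)

# Three batches: two full batches of 5000 names and a final one of 2001.
lengths(batches)
```

In this sketch the batches are processed one after the other, which is why a query of this size takes roughly three times as long as a single full batch.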

Examples

## Not run: 
# Take a subset of the testfile to speed up runtime
tnrs_testfile <- tnrs_testfile[1:20, ]

results <- TNRS_robust(taxonomic_names = tnrs_testfile)

# Inspect the results
head(results, 10)

## End(Not run)
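
As noted under taxonomic_names, the input can be either a two-column data.frame or a character vector. The sketch below builds both forms in base R. The column names id and taxonomic_name are arbitrary choices made for illustration (the documentation only specifies the column order), and the calls to TNRS_robust() are left commented out because they require the package and an internet connection.

```r
species <- c("Acer rubrum", "Quercus alba", "Pinus strobus")

# Form 1: data.frame with the row number first and the name second.
names_df <- data.frame(
  id = seq_along(species),
  taxonomic_name = species,
  stringsAsFactors = FALSE
)

# Form 2: a plain character vector of names.
names_vec <- species

# Either object can then be passed as taxonomic_names, for example:
# results_df  <- TNRS_robust(taxonomic_names = names_df)
# results_vec <- TNRS_robust(taxonomic_names = names_vec, matches = "all")
```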


EnquistLab/RTNRS documentation built on Oct. 14, 2024, 2:11 p.m.