herpSpecies: Retrieve Reptile Species and Taxonomic Information from RDB

herpSpecies    R Documentation

Description

Retrieves a list of reptile species from The Reptile Database (RDB) based on a search URL, and optionally returns detailed taxonomic information for each species. This function can also save progress to disk during sampling and extract species-specific URLs for further use.

Usage

herpSpecies(url,
            showProgress = TRUE,
            dataList = NULL,
            taxonomicInfo = FALSE,
            fullHigher = FALSE,
            getLink = FALSE,
            cores = max(1, parallel::detectCores() - 1),
            checkpoint = NULL,
            backup_file = NULL)

Arguments

url

Character string. A search URL generated via an advanced search on the RDB website or with herpAdvancedSearch.

showProgress

Logical. If TRUE, prints sampling progress in the console. Default is TRUE.

dataList

Optional. A data frame with columns species and url, used to extract taxonomic information from previously sampled species links.

taxonomicInfo

Logical. If TRUE, returns taxonomic information for each species, including order, suborder, family, genus, author, and year. Default is FALSE.

fullHigher

Logical. If TRUE, includes the full higher taxonomic hierarchy as reported by RDB (e.g., including subfamilies). Requires taxonomicInfo = TRUE. Default is FALSE.

getLink

Logical. If TRUE, includes the RDB URL for each species (useful for follow-up functions like herpSynonyms). Default is FALSE.

cores

Integer. Number of CPU cores to use for parallel processing. Default is one less than the number of available cores.

checkpoint

Optional. Integer specifying the number of species to process before saving a temporary backup. Backup is only saved if cores = 1. If set to 1, saves progress after each species (safest but slowest).

backup_file

Optional. Character string specifying the path to an .rds file for saving intermediate results when checkpoint is set. Must end in .rds.
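
The default for cores relies on parallel::detectCores(), which is documented to return NA when the core count cannot be determined; in that case max(1, NA - 1) is itself NA. A minimal sketch of a defensive way to choose the value before passing it to herpSpecies follows. The helper name safe_cores is hypothetical and not part of letsHerp.

```r
# Hypothetical helper (not part of letsHerp): pick a usable core count.
safe_cores <- function(requested = NULL) {
  available <- parallel::detectCores()
  # detectCores() may return NA when the count is unknown; fall back to 1.
  if (is.na(available)) available <- 1L
  if (is.null(requested)) {
    # Mirror the documented default: all cores but one, never below 1.
    return(max(1L, available - 1L))
  }
  # Clamp an explicit request to the range [1, available].
  min(max(1L, as.integer(requested)), available)
}

n_cores <- safe_cores()      # documented default, made NA-safe
one_core <- safe_cores(1)    # serial run, as required for checkpointing
```

The result can then be supplied as cores = n_cores in the call.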

Value

If taxonomicInfo = FALSE (default), returns a character vector of species names.

If taxonomicInfo = TRUE, returns a data frame with columns: order, suborder (if available), family, genus, species, author, and year.

If fullHigher = TRUE, the data frame also includes a column with the full higher taxa classification.

If getLink = TRUE, includes a column with the URL for each species’ page on RDB.
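
Because the return type depends on taxonomicInfo, downstream code may need to handle both a character vector and a data frame. The sketch below shows one way to normalise either shape to a vector of species names. The helper species_names is hypothetical, and the two objects are hand-built stand-ins for the documented return shapes (author and year columns omitted) so that the snippet runs without network access; they are not actual RDB output.

```r
# Hypothetical helper (not part of letsHerp): return species names from
# either documented return shape.
species_names <- function(x) {
  if (is.data.frame(x)) {
    as.character(x$species)   # taxonomicInfo = TRUE: use the species column
  } else {
    as.character(x)           # taxonomicInfo = FALSE: already a name vector
  }
}

# Hand-built stand-ins for the two return shapes (illustrative only).
names_only <- c("Boa constrictor", "Boa imperator")
with_taxonomy <- data.frame(
  order = "Squamata",
  suborder = "Serpentes",
  family = "Boidae",
  genus = "Boa",
  species = names_only,
  stringsAsFactors = FALSE
)

same_names <- identical(species_names(names_only),
                        species_names(with_taxonomy))
```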

Note

If checkpoint is used, progress will only be saved when cores = 1. This prevents potential write conflicts in parallel mode.
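
The constraints described above (cores = 1, an integer checkpoint, and a backup_file path ending in .rds) can be checked before starting a long run. The following is a minimal sketch; checkpoint_args_ok is a hypothetical helper, not part of letsHerp, and the commented call assumes that the backup can be read back with readRDS, which this page does not state explicitly.

```r
# Hypothetical helper (not part of letsHerp): TRUE only if the arguments
# satisfy the documented conditions for saving checkpoints.
checkpoint_args_ok <- function(cores, checkpoint, backup_file) {
  is_count <- is.numeric(checkpoint) && length(checkpoint) == 1 &&
    !is.na(checkpoint) && checkpoint >= 1 && checkpoint == round(checkpoint)
  is_rds <- is.character(backup_file) && length(backup_file) == 1 &&
    grepl("\\.rds$", backup_file)
  isTRUE(cores == 1) && is_count && is_rds
}

backup <- file.path(tempdir(), "boa_backup.rds")
ok_serial   <- checkpoint_args_ok(cores = 1, checkpoint = 10, backup_file = backup)
ok_parallel <- checkpoint_args_ok(cores = 2, checkpoint = 10, backup_file = backup)

# With valid arguments, a checkpointed run might look like this (needs the
# letsHerp package and network access, so it is left commented out):
# boa <- herpSpecies(herpAdvancedSearch(genus = "Boa"),
#                    taxonomicInfo = TRUE,
#                    cores = 1,
#                    checkpoint = 10,
#                    backup_file = backup)
# partial <- readRDS(backup)  # assumption: backup is an ordinary .rds object
```

Here ok_serial is TRUE and ok_parallel is FALSE, reflecting that no backup is written in parallel mode.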

See Also

herpAdvancedSearch, herpSynonyms, herpSearch

Examples


boa <- herpSpecies(herpAdvancedSearch(genus = "Boa"),
                   taxonomicInfo = TRUE,
                   cores = 2)



letsHerp documentation built on June 23, 2025, 5:09 p.m.