| validate_peru_mammals | R Documentation |
Matches given species names against the official list of mammal species of Peru (Pacheco et al. 2021). Uses a hierarchical matching strategy that includes direct matching, genus-level matching, and fuzzy matching to maximize successful matches while maintaining accuracy.
Peru Mammals Database:
575 mammal species
Binomial nomenclature only (no infraspecific taxa)
Includes 6 undescribed species ("sp." cases)
Fields: genus, species, scientific_name, common_name, family, order, endemic
validate_peru_mammals(splist, quiet = TRUE)
splist |
A character vector containing the species names to be matched. Names can be in any format (uppercase, lowercase, with underscores, etc.). Duplicate names are preserved in the output. |
quiet |
Logical, default TRUE. If FALSE, prints informative messages about the matching progress. |
Matching Strategy: The function implements a hierarchical matching pipeline:
Node 1 - Direct Match: Exact matching of binomial names (genus + species)
Node 2 - Genus Match: Exact matching at genus level
Node 3 - Fuzzy Genus: Fuzzy matching for genus with typos (max distance = 1)
Node 4 - Fuzzy Species: Fuzzy matching for species within matched genus
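The four nodes above can be illustrated with a minimal base-R sketch. This is not the package's internal code: the toy `ref` table and the helpers `closest()` and `match_one()` are hypothetical, and applying the same maximum distance of 1 to species is an assumption (the documentation states it only for genus).

```r
# Toy reference list; the real function matches against the peru_mammals database.
ref <- data.frame(
  genus   = c("PANTHERA", "PUMA", "TREMARCTOS"),
  species = c("ONCA", "CONCOLOR", "ORNATUS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: closest candidate within max_dist edits, otherwise NA.
closest <- function(x, candidates, max_dist = 1) {
  d <- drop(utils::adist(x, candidates))
  if (min(d) <= max_dist) candidates[which.min(d)] else NA_character_
}

# Hypothetical helper: walk one name through the four nodes.
match_one <- function(name, max_dist = 1) {
  parts   <- strsplit(toupper(trimws(name)), "\\s+")[[1]]
  genus   <- parts[1]
  species <- if (length(parts) >= 2) parts[2] else NA_character_

  # Node 1: exact binomial match
  if (!is.na(species) && any(ref$genus == genus & ref$species == species)) {
    return(paste(genus, species))
  }
  # Node 2: exact genus match; Node 3: fuzzy genus match if that fails
  if (!genus %in% ref$genus) {
    genus <- closest(genus, unique(ref$genus), max_dist)
  }
  if (is.na(genus)) return(NA_character_)
  if (is.na(species)) return(genus)  # genus-only input (Rank 1)

  # Node 4: fuzzy species match restricted to the matched genus
  species <- closest(species, ref$species[ref$genus == genus], max_dist)
  if (is.na(species)) NA_character_ else paste(genus, species)
}

match_one("Pumma concolor")
match_one("Tremarctos ornatu")
```

Restricting Node 4 to species of the already-matched genus keeps the candidate set small, which is what lets a fuzzy step stay accurate.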
Special Cases:
Handles "sp." cases: "Akodon sp. Ancash", "Oligoryzomys sp. B", etc.
Case-insensitive matching
Removes common qualifiers (CF., AFF.)
Standardizes spacing and formatting
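As a rough illustration of the clean-up steps listed above, the following base-R sketch normalizes names before matching. The helper `standardize_name()` is hypothetical and its regular expressions are assumptions; the package's own standardization may differ in detail.

```r
# Hypothetical helper: normalize case, qualifiers, and spacing.
standardize_name <- function(x) {
  x <- gsub("_", " ", x, fixed = TRUE)                      # underscores to spaces
  x <- toupper(x)                                           # case-insensitive comparison
  x <- gsub("\\b(CF|AFF)\\.?(\\s+|$)", "", x, perl = TRUE)  # drop CF. / AFF. qualifiers
  x <- gsub("\\s+", " ", x)                                 # collapse repeated whitespace
  trimws(x)
}

standardize_name(c("panthera_cf._onca", "  Puma   aff concolor ", "Akodon sp. Ancash"))
```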
Rank System:
Rank 1: Genus level only (e.g., "Panthera")
Rank 2: Binomial (genus + species, e.g., "Panthera onca")
Ambiguous Matches:
When multiple candidates have identical fuzzy match scores, a warning is
issued and the first match is selected. Use get_ambiguous_matches()
to examine these cases.
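To see how such a tie can arise, consider the sketch below. It uses a toy candidate list and a hypothetical helper, `pick_first_tie()`, built on `utils::adist()`; it mirrors the behavior described above (warn, then take the first candidate) but is not the package's implementation.

```r
# Hypothetical helper: report tied best candidates and select the first one.
pick_first_tie <- function(query, candidates) {
  d    <- drop(utils::adist(query, candidates))  # edit distance to each candidate
  tied <- candidates[d == min(d)]                # all candidates at the best distance
  if (length(tied) > 1) {
    warning("Ambiguous match for '", query, "': ",
            paste(tied, collapse = ", "), "; selecting the first")
  }
  list(selected = tied[1], tied = tied)
}

# "PAMA" is one edit away from both "LAMA" and "PUMA".
res <- suppressWarnings(pick_first_tie("PAMA", c("LAMA", "PUMA", "TAPIRUS")))
res$selected
res$tied
```

Because the selection depends on candidate order, reviewing the tied candidates by hand is safer than trusting the first match.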
Input Requirements:
Species names must be provided as binomials (Genus species) WITHOUT:
Author information: "Panthera onca Linnaeus"
Infraspecific taxa: "Panthera onca onca"
Parenthetical authors: "Panthera onca (Linnaeus, 1758)"
Valid formats:
Standard binomial: "Panthera onca"
Undescribed species: "Akodon sp. Ancash"
Case-insensitive: "PANTHERA ONCA" or "panthera onca"
Names with three or more elements (other than the "sp." cases shown above) are automatically rejected with a warning.
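Before calling the function, it can help to flag names that would be rejected. The base-R sketch below is one possible pre-check; the helpers `count_elements()` and `is_sp_case()` are hypothetical, and treating "sp." names as exempt from the element limit is an assumption based on the valid formats listed above.

```r
# Hypothetical helpers for screening input before validation.
count_elements <- function(x) lengths(strsplit(trimws(x), "\\s+"))
is_sp_case     <- function(x) grepl("^\\S+\\s+sp\\.", x, ignore.case = TRUE)

splist <- c(
  "Panthera onca",
  "Panthera onca onca",              # infraspecific taxon
  "Panthera onca (Linnaeus, 1758)",  # parenthetical author
  "Akodon sp. Ancash"                # undescribed species, kept as-is
)

# Names with more than two elements that are not "sp." cases need cleaning.
needs_cleanup <- count_elements(splist) > 2 & !is_sp_case(splist)
splist[needs_cleanup]
```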
A tibble with the following columns:
Integer. Original position in input vector
Character. Original input name (standardized)
Character. Matched name from database or "—"
Character. Quality of match ("Exact rank", "No match", etc.)
Logical. Whether a match was found
Integer. Input taxonomic rank (1 or 2)
Integer. Matched taxonomic rank (1 or 2)
Logical. Whether ranks match exactly
Logical. Whether match is valid at correct rank
Character. Input genus (uppercase)
Character. Input species (uppercase)
Character. Taxonomic authority if provided
Character. Matched genus (uppercase)
Character. Matched species (uppercase)
Integer. Edit distance for genus (0=exact, >0=fuzzy, NA=no match)
Integer. Edit distance for species (0=exact, >0=fuzzy, NA=no match or genus-only)
Character. Scientific name from peru_mammals
Character. Common name in Spanish
Character. Family
Character. Order
Logical. Endemic to Peru?
Attributes:
The output includes metadata accessible via attr():
target_database: "peru_mammals"
matching_date: Date of matching
n_input: Number of input names
n_matched: Number of successful matches
match_rate: Percentage of successful matches
n_fuzzy_genus: Number of fuzzy genus matches
n_fuzzy_species: Number of fuzzy species matches
ambiguous_genera: Ambiguous genus matches (if any)
ambiguous_species: Ambiguous species matches (if any)
See also: get_ambiguous_matches() to retrieve ambiguous match details.
# Basic usage
species_list <- c("Panthera onca", "Tremarctos ornatus", "Puma concolor")
results <- validate_peru_mammals(species_list)
# Check results
table(results$matched)
table(results$Match.Level)
# View matched species
results |>
  dplyr::filter(matched) |>
  dplyr::select(Orig.Name, Matched.Name, common_name, endemic)
# With typos (fuzzy matching)
typos <- c("Pumma concolor", "Tremarctos ornatu") # Spelling errors
results_fuzzy <- validate_peru_mammals(typos, quiet = FALSE)
# Check for ambiguous matches
get_ambiguous_matches(results_fuzzy, type = "genus")
# Access metadata
attr(results, "match_rate")
attr(results, "n_fuzzy_genus")
# With special "sp." cases
sp_cases <- c("Akodon sp. Ancash", "Oligoryzomys sp. B")
results_sp <- validate_peru_mammals(sp_cases)
# Should match exactly