fuzzy_search: Match misspelled or partial scientific names

View source: R/helper_methods.R

fuzzy_search    R Documentation

Description

Match misspelled or partial scientific names

Usage

fuzzy_search(
  x,
  term,
  sensitivity = 0,
  allow_term_removal = FALSE,
  force_binomial = FALSE
)

Arguments

x

A tibble created with load_taxonomies(), load_population(), or load_sample().

term

A string consisting of a scientific name.

sensitivity

An integer giving the number of character mismatches tolerated. Defaults to 0, i.e. no mismatches are tolerated.

allow_term_removal

A logical indicating whether searches against only the first word of term should be carried out if no matches are found. Defaults to FALSE.

force_binomial

A logical indicating whether term should be stripped to a maximum of two words. Defaults to FALSE.

Details

The sensitivity parameter sets the number of character mismatches tolerated for a match to be reported. Higher values return more matches, but those matches may be less relevant.

The allow_term_removal parameter permits the search query to be stripped to the characters before the first white space, so that only the first word of the query is searched. This is useful when the query is "Genus sp." or "Genus indet.". fuzzy_search() always searches with the entire query first and strips terms only if no hits are found.

If force_binomial is set to TRUE, by contrast, the query is limited to its first two words before searching begins. This is useful when the query credits the author of the name (the taxonomic authority), e.g. "Birgus latro (Linnaeus, 1767)", or to prevent subspecies names (so-called trinomials) from blocking a match.
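The query handling described above can be sketched in base R. This is only an illustration, not the package's implementation: first_words() and mismatches() are hypothetical helpers, and counting mismatches as edit distance (via utils::adist()) is an assumption that this page does not confirm.

```r
# Hypothetical helpers for illustration only; not part of moultdbtools.

# Keep at most the first n whitespace-separated words of a query.
first_words <- function(term, n) {
  words <- strsplit(trimws(term), "[[:space:]]+")[[1]]
  paste(utils::head(words, n), collapse = " ")
}

# Count character mismatches between two names.
# Assumption: a "mismatch" is one unit of generalized edit distance.
mismatches <- function(a, b) {
  as.integer(utils::adist(a, b))
}

# What force_binomial = TRUE would search for: the first two words.
binomial_query <- first_words("Birgus latro (Linnaeus, 1767)", 2)

# What the allow_term_removal fallback would search for: the first word.
genus_query <- first_words("Miacis sp.", 1)

# Mismatches a sensitivity setting would need to tolerate for this typo.
typo_distance <- mismatches("Miacus deutschi", "Miacis deutschi")
```

Under these assumptions, a query whose typo_distance is no greater than sensitivity would be reported as a candidate match.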

Value

A list of candidate match(es), if applicable.

Examples

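# Search with the full name and no mismatch tolerance
fuzzy_search(load_sample(), "Miacis deutschi")
# Fall back to the first word ("Miacis") if the full query has no hits
fuzzy_search(load_sample(), "Miacis sp.", allow_term_removal = TRUE)
# Tolerate one character mismatch (misspelled "Miacus")
fuzzy_search(load_sample(), "Miacus deutschi", sensitivity = 1)
# Limit the query to its first two words before searching
fuzzy_search(load_sample(), "Miacis deutschi (Smith, 2022)", force_binomial = TRUE)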

MoultDB/moultdbtools documentation built on Feb. 2, 2024, 5:21 p.m.