find_match_str: Find a string in a database of known strings


View source: R/find_match_str.R

Description

Checks a string against a database of known strings when the string may be damaged by incomplete transcription, OCR errors, illegible words, or similar problems.

Usage

find_match_str(str_to_check, database, method = "osa", no_cores = 2,
  year_limits = FALSE, country_limits = FALSE, database_strings = NA,
  str_to_check_col = NA)

Arguments

str_to_check

String to check against the database

database

Database of known strings

method

Method used to find a match. See Details below

no_cores

How many cores to dedicate to this function execution

year_limits

Logical; whether to use the year to limit the strings available to match

country_limits

Logical; whether to use the country to limit the strings available to match

database_strings

Which column of database to use for matching; only needed if database has more than one column

str_to_check_col

Which column of str_to_check to use for matching; only needed if str_to_check has more than one column

Details

The method argument is passed to stringdist::stringdist(). For details of the available methods, see ?stringdist::`stringdist-metrics`.
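
To give a rough sense of how the choice of method changes the result, the sketch below calls stringdist::stringdist() directly. It does not call find_match_str(), and the example strings are made up for illustration.

```r
library(stringdist)

# Swapping two adjacent characters counts as one edit under "osa"
# (optimal string alignment) but as two edits under "lv" (Levenshtein).
d_osa <- stringdist("ab", "ba", method = "osa")
d_lv  <- stringdist("ab", "ba", method = "lv")

# "jw" (Jaro, or Jaro-Winkler when p > 0) returns a distance
# between 0 and 1 rather than an edit count.
d_jw <- stringdist("Smithsonian", "Smithsonain", method = "jw")

print(c(osa = d_osa, lv = d_lv, jw = d_jw))
```

Because edit-count methods and the 0 to 1 methods are on different scales, scores from different methods are probably not directly comparable.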

Value

A data frame with the columns str_to_check (the string provided), match (the string matched from the database), and score (the score for this match).
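
The shape of the return value can be illustrated with a minimal sketch. The helper closest_match() and the known vector below are hypothetical and are not part of collexScrub; the sketch also assumes that score is a string distance where lower means closer, which this page does not state.

```r
library(stringdist)

# Hypothetical helper, not the package's implementation: returns the single
# closest known string and its distance for one input string.
closest_match <- function(str_to_check, known, method = "osa") {
  scores <- stringdist(str_to_check, known, method = method)
  best <- which.min(scores)  # index of the smallest distance
  data.frame(
    str_to_check = str_to_check,
    match = known[best],
    score = scores[best],
    stringsAsFactors = FALSE
  )
}

known <- c("Washington", "Wellington", "Wilmington")  # made-up database
res <- closest_match("Washingtno", known)
print(res)
```

Here the transposed "no" costs a single edit under "osa", so the row pairs "Washingtno" with "Washington".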


Smithsonian/collexScrub documentation built on July 19, 2019, 6:59 p.m.