compare: Compare source strings against target strings

View source: R/stringSim.R

Description

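Compares each string in a source vector against a vector of target strings and makes the most similar target strings available for inspection.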

Usage

compare(source_strings, target_strings, flatten = TRUE,
  max_top_hits = 5, minimum_similarity = 0.7)

Arguments

source_strings

a vector of strings, each of which is processed to find the most similar strings amongst the vector of target strings

target_strings

a vector of strings against which each source string is compared

flatten

a logical value indicating whether or not strings should be processed in flattened form (i.e. lower cased, ASCII, latin alphabet, etc.)

max_top_hits

an integer indicating how many 'top hits' for each source string should be made available for detailed examination

minimum_similarity

a decimal between 0 and 1 indicating how similar a hit must be to be counted as potentially relevant (1 for identical hits only)
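
To make the roles of flatten, max_top_hits and minimum_similarity concrete, here is a minimal base R sketch. It is not similr's implementation: the helpers flatten_strings and top_hits are hypothetical, and the similarity measure (one minus the Levenshtein distance divided by the length of the longer string) is an assumption chosen only for illustration. The package may use a different metric and a different flattening procedure.

```r
# Hypothetical illustration only: this is NOT how similr::compare is implemented.

# Assumed "flattening": lower-case, keep only a-z, 0-9 and spaces.
# Accented or non-latin characters are simply dropped, not transliterated.
flatten_strings <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]", "", x)
  trimws(gsub(" +", " ", x))  # collapse repeated spaces
}

# For each source string, return up to max_top_hits targets whose
# assumed similarity is at least minimum_similarity, best first.
top_hits <- function(source_strings, target_strings, flatten = TRUE,
                     max_top_hits = 5, minimum_similarity = 0.7) {
  src <- if (flatten) flatten_strings(source_strings) else source_strings
  tgt <- if (flatten) flatten_strings(target_strings) else target_strings

  dist <- utils::adist(src, tgt)                  # Levenshtein distances
  longest <- outer(nchar(src), nchar(tgt), pmax)  # length of the longer string
  sim <- 1 - dist / pmax(longest, 1)              # guard against empty strings

  lapply(seq_along(src), function(i) {
    scores <- sim[i, ]
    keep <- which(scores >= minimum_similarity)   # apply the threshold
    keep <- keep[order(scores[keep], decreasing = TRUE)]
    keep <- head(keep, max_top_hits)              # cap the number of hits
    data.frame(target = target_strings[keep],
               similarity = scores[keep],
               stringsAsFactors = FALSE)
  })
}

hits <- top_hits(c("Hello", "Hi"), c("Hell", "Hello how are you", "Hiya"))
```

Under these assumptions, lowering minimum_similarity admits weaker matches, while max_top_hits only caps how many of the surviving matches are kept for each source string.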

Value

an object holding the results of the similarity comparisons between the source and target string vectors; individual outputs can be extracted from it with $ (for example, outputs$outputs$raw_scores, as used in the Examples)

Examples

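# Strings to look up, and the candidates to match them against
source_vec <- c("Hello", "Hi", "Hello you")
target_vec <- c("Hell", "Hello how are you", "Hiya", "Hell, hello you")
# Run the comparison with the default settings
outputs <- similr::compare(source_vec, target_vec)
# Inspect the raw similarity scores
outputs$outputs$raw_scores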
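
The optional arguments listed under Usage can be overridden by name. The sketch below assumes the similr package is installed and does nothing otherwise. The argument values are arbitrary, and because the structure of the returned object is not documented here, it is inspected with base R's str() rather than by assuming element names.

```r
# Sketch only: runs the comparison only if similr is available.
source_vec <- c("Hello", "Hi", "Hello you")
target_vec <- c("Hell", "Hello how are you", "Hiya", "Hell, hello you")

strict <- NULL
if (requireNamespace("similr", quietly = TRUE)) {
  strict <- similr::compare(
    source_vec, target_vec,
    flatten = FALSE,           # compare the strings as given
    max_top_hits = 2,          # keep at most two top hits per source string
    minimum_similarity = 0.9   # count only near-identical hits as relevant
  )
  str(strict, max.level = 2)   # explore what the returned object contains
}
```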

similr documentation built on Nov. 20, 2018, 1:09 a.m.