same_text: Compare Text Similarity Across Lists

View source: R/1-same-text.R

same_text    R Documentation

Compare Text Similarity Across Lists

Description

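Compares two or more lists of character strings and scores their similarity using one or more string similarity methods from the stringdist package.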

Usage

same_text(
  ...,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw",
    "soundex"),
  q = 1,
  p = NULL,
  bt = 0,
  weight = c(d = 1, i = 1, s = 1, t = 1),
  digits = 3
)

Arguments

...

Lists of character strings to compare

method

Character vector of similarity methods from stringdist. Choose from: "osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex" (default: all)

q

Size of q-gram for q-gram based methods (default: 1)

p

Winkler scaling factor for the "jw" method. The signature default is NULL; the documented default value is 0.1.

bt

Winkler boost threshold for the "jw" method (default: 0)

weight

Named vector of weights for the edit operations: deletion (d), insertion (i), substitution (s), transposition (t) (default: 1 for each)

digits

Number of digits to round results to (default: 3)
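
To make the method and q arguments more concrete, here is a minimal base R sketch of two of the listed ideas: a Jaccard similarity on q-grams and a Levenshtein distance rescaled to a 0 to 1 similarity. The helper names qgrams, jaccard_sim and lv_sim are hypothetical and are not part of samesies or stringdist, and the rescaling by the longer string's length is an assumption about how a distance becomes a similarity score, not a statement of what same_text does internally.

```r
# Hypothetical helpers for illustration only; not part of samesies or stringdist.

# Split a string into its overlapping q-grams (substrings of length q).
qgrams <- function(x, q = 1) {
  n <- nchar(x)
  if (n < q) return(character(0))
  starts <- seq_len(n - q + 1)
  substring(x, starts, starts + q - 1)
}

# Jaccard similarity on the sets of unique q-grams:
# size of the intersection divided by size of the union.
jaccard_sim <- function(a, b, q = 1) {
  A <- unique(qgrams(a, q))
  B <- unique(qgrams(b, q))
  if (length(A) == 0 && length(B) == 0) return(1)
  length(intersect(A, B)) / length(union(A, B))
}

# Levenshtein distance (base R utils::adist), rescaled to [0, 1]
# by the length of the longer string (assumed normalization).
lv_sim <- function(a, b) {
  longest <- max(nchar(a), nchar(b))
  if (longest == 0) return(1)
  1 - as.numeric(utils::adist(a, b)) / longest
}

jaccard_sim("hello", "helo", q = 1)  # 1: both strings use the letters h, e, l, o
jaccard_sim("hello", "helo", q = 2)  # 0.75: 3 shared 2-grams out of 4 distinct
lv_sim("hello", "helo")              # 0.8: one deletion over 5 characters
```

With q = 1 the two strings look identical because only the set of letters is compared; raising q makes the q-gram based methods sensitive to letter order and repetition.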

Value

An S3 object of class "similar_text" containing:

  • scores: Numeric similarity scores by method and comparison

  • summary: Summary statistics by method and comparison

  • methods: Methods used for comparison

  • list_names: Names of compared lists
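
The components listed above could be inspected roughly as follows. This is a sketch under two assumptions: that the samesies package is installed, and that the returned S3 object is list-based so its components can be reached with `$`. The block is skipped when the package is not available.

```r
# Assumes the samesies package is installed; otherwise nothing is run.
if (requireNamespace("samesies", quietly = TRUE)) {
  list1 <- list("hello", "world")
  list2 <- list("helo", "word")

  # Restrict the comparison to two of the documented methods.
  result <- samesies::same_text(list1, list2, method = c("lv", "jw"))

  print(class(result))        # documented as "similar_text"

  # Assumption: components are accessible with `$`, as for a list-based S3 object.
  print(result$methods)       # methods used for comparison
  print(result$list_names)    # names of the compared lists
  str(result$scores)          # similarity scores by method and comparison
  str(result$summary)         # summary statistics by method and comparison
}
```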

Examples

# Two lists of strings, compared element by element position
list1 <- list("hello", "world")
list2 <- list("helo", "word")
# Compare using all default methods
result <- same_text(list1, list2)

samesies documentation built on April 4, 2025, 2:08 a.m.