find_closest_strings: Find the closest pairs of strings from a vector of strings.


Description

Uses the Jaro-Winkler distance to identify the closest strings; see https://journal.r-project.org/archive/2014-1/loo.pdf
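To give a feel for the distance itself, here is a minimal sketch that assumes the stringdist package (the subject of the paper linked above) is installed. Whether find_closest_strings calls stringdist internally, and with which prefix factor, is not stated on this page; the value p = 0.1 below is only the conventional Winkler choice.

```r
library(stringdist)

# Jaro distance: method "jw" with the default prefix factor p = 0
d_jaro <- stringdist("MARTHA", "MARHTA", method = "jw")

# Jaro-Winkler distance: a prefix factor p > 0 shrinks the distance
# for strings that share a common prefix (assumed value, p = 0.1)
d_jw <- stringdist("MARTHA", "MARHTA", method = "jw", p = 0.1)

# Distances lie in [0, 1]: 0 means identical, 1 means nothing in common
d_same <- stringdist("MARTHA", "MARTHA", method = "jw", p = 0.1)
d_far  <- stringdist("MARTHA", "xyz", method = "jw", p = 0.1)
```

Because "MARTHA" and "MARHTA" share the prefix "MAR", the Jaro-Winkler distance is smaller than the plain Jaro distance for this pair.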

Usage

find_closest_strings(strings_, quantile_ = 0.01)

Arguments

strings_

A vector of strings. If the vector contains duplicate strings, the duplicates are removed.

quantile_

Proportion of the closest pairs of strings to return. For quantile_ = 0.1, the closest 10 percent of pairs are returned.

Value

A matrix containing the closest pairs of strings, that is, the proportion quantile_ of all pairs with the smallest distances.
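The selection described above can be sketched as follows. This is an illustration under stated assumptions, not the package's actual implementation: closest_pairs_sketch, its arguments prop and p, and the two-column output layout are hypothetical names and choices made for this example, and the stringdist package is assumed to be installed.

```r
library(stringdist)

# Hypothetical helper, not part of RTTanalyse: returns the proportion
# `prop` of string pairs with the smallest Jaro-Winkler distance.
closest_pairs_sketch <- function(strings, prop = 0.01, p = 0.1) {
  strings <- unique(strings)            # drop duplicate strings
  pairs <- t(combn(strings, 2))         # one row per unordered pair
  d <- stringdist(pairs[, 1], pairs[, 2], method = "jw", p = p)
  cutoff <- quantile(d, probs = prop, names = FALSE)
  keep <- d <= cutoff                   # pairs at or below the quantile
  out <- pairs[keep, , drop = FALSE]
  out[order(d[keep]), , drop = FALSE]   # closest pair first
}

# Example identifiers (made up for illustration)
ids <- c("patient_01", "patient_02", "control_A", "patient_01", "zzz")
res <- closest_pairs_sketch(ids, prop = 0.1)
```

Enumerating every pair with combn grows quadratically with the number of strings, so this sketch is only suitable for moderately sized vectors.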

Examples

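library(tidyverse)
library(RTTanalyse)  # assumed to be the package that provides find_closest_strings()
# Read the data, extract the Identifier column, and find its closest pairs of strings
read_csv('../data/RTT_dataV2.csv') %>%
    .$Identifier %>%
    find_closest_strings()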

konkam/RTTanalyse documentation built on May 20, 2019, 12:55 p.m.