dan-reznik/clustringr: Cluster Strings by Edit-Distance

Returns an edit-distance-based clustering of an input vector of strings. Each cluster contains a set of strings with small mutual edit distance (e.g., Levenshtein, optimal string alignment, Damerau-Levenshtein), as computed by stringdist::stringdist(). The set of all mutual edit distances is then used by graph algorithms (from package igraph) to single out subsets of high connectivity.
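
The idea described above can be sketched directly with stringdist and igraph. This is a minimal illustration under stated assumptions, not clustringr's own code or API: the sample words, the max_dist threshold, and the use of connected components as the "high connectivity" criterion are all hypothetical choices made for the example.

```r
# Sketch only: calls stringdist and igraph directly, not clustringr itself.
library(stringdist)
library(igraph)

# Hypothetical input and threshold, chosen for illustration.
words <- c("apple", "appel", "aple", "banana", "bananna", "cherry")
max_dist <- 2

# Pairwise Levenshtein distances as a full symmetric matrix.
d <- as.matrix(stringdistmatrix(words, method = "lv"))

# Link two strings when their edit distance is within the threshold.
adj <- (d <= max_dist) * 1
diag(adj) <- 0  # no self-loops
g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# Assumption: treat each connected component as one cluster.
membership <- components(g)$membership
clusters <- split(words, membership)
print(clusters)
```

Connected components are the simplest way to group linked strings. A stricter notion of connectivity (for example, community detection in igraph) could be swapped in at the last step, and method = "osa" or method = "dl" selects the other edit distances mentioned above.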

Package details

Author: Dan S. Reznik
Maintainer: Dan S. Reznik <dreznik@gmail.com>
License: MIT + file LICENSE
Version: 1.0
Package repository: GitHub (dan-reznik/clustringr)
Installation

Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("dan-reznik/clustringr")
dan-reznik/clustringr documentation built on May 20, 2019, 12:35 p.m.