clustringr: Cluster Strings by Edit-Distance

Clusters an input vector of strings by edit distance. Each cluster contains strings with small mutual edit distance (e.g., Levenshtein, optimal string alignment, or Damerau-Levenshtein), as computed by stringdist::stringdist(). The full set of pairwise edit distances is then passed to graph algorithms (from package 'igraph') to single out highly connected subsets.
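
The general idea can be illustrated with a small, self-contained sketch. It is written in Python using only the standard library, and it is not the package's R code or API. The function names (levenshtein, cluster_by_edit_distance), the max_dist threshold, and the use of connected components as the notion of "high connectivity" are all assumptions made for illustration. The package itself relies on stringdist and igraph and may use different distances and graph algorithms.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # drop a character of a
                               current[j - 1] + 1,       # add a character of b
                               previous[j - 1] + cost))  # substitute (or match)
        previous = current
    return previous[-1]


def cluster_by_edit_distance(strings, max_dist=2):
    """Group strings into connected components of the graph that links
    every pair whose edit distance is at most max_dist.

    max_dist is a hypothetical threshold chosen for this sketch.
    """
    strings = list(dict.fromkeys(strings))  # drop duplicates, keep order
    neighbors = {s: set() for s in strings}
    for s, t in combinations(strings, 2):
        if levenshtein(s, t) <= max_dist:
            neighbors[s].add(t)
            neighbors[t].add(s)

    clusters, seen = [], set()
    for start in strings:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


words = ["kitten", "sitten", "sitting", "apple", "appel", "banana"]
print(cluster_by_edit_distance(words, max_dist=2))
```

With this threshold, "kitten" and "sitting" end up in the same cluster even though their own distance is 3, because both are within distance 2 of "sitten". That chaining effect is a property of connected components, and it is one reason a real implementation might prefer a stricter graph criterion.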

Package details

Author: Dan S. Reznik
Maintainer: Dan S. Reznik <[email protected]>
License: MIT + file LICENSE
Package repository: CRAN
Installation: install the latest version of this package from CRAN by entering install.packages("clustringr") in R.

clustringr documentation built on May 1, 2019, 9:23 p.m.