refinr: Cluster and Merge Similar Values Within a Character Vector

refinr R Documentation

Cluster and Merge Similar Values Within a Character Vector


These functions take a character vector as input, identify clusters of similar values, and then merge each cluster so that all of its values become identical. The functions implement the key collision and n-gram fingerprint algorithms from the open source tool OpenRefine.
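To make the key collision idea concrete, here is a minimal base R sketch. It is an illustration only, not refinr's implementation: the helper names `fingerprint` and `merge_by_fingerprint` are hypothetical, and the normalization steps (lowercasing, stripping punctuation, sorting unique tokens) and the rule of keeping the most frequent value in each cluster are assumptions made for this example. The package's own functions may differ in detail.

```r
# Hypothetical helper: reduce each string to a normalized "fingerprint" key.
# Strings that differ only in case, punctuation, or word order share a key.
fingerprint <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)        # drop punctuation
  x <- trimws(x)                         # drop leading/trailing whitespace
  tokens <- strsplit(x, "[[:space:]]+")  # split into words
  vapply(
    tokens,
    function(tok) paste(sort(unique(tok)), collapse = " "),
    character(1)
  )
}

# Hypothetical helper: cluster values by fingerprint, then replace every
# value in a cluster with that cluster's most frequent original value.
merge_by_fingerprint <- function(x) {
  keys <- fingerprint(x)
  groups <- split(x, keys)
  representative <- vapply(
    groups,
    function(v) names(which.max(table(v))),
    character(1)
  )
  unname(representative[match(keys, names(groups))])
}

x <- c("Acme Pizza, Inc.", "ACME PIZZA INC", "Acme Pizza, Inc.",
       "pizza acme inc", "Bob's Burgers")
merged <- merge_by_fingerprint(x)
print(merged)
```

In this sketch the first four inputs collide on the same key and are all rewritten to the most common spelling, while the unrelated fifth value is left unchanged.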

Documentation for Open Refine

Development links

refinr features the following functions:

  • key_collision_merge

  • n_gram_merge


Maintainer: Chris Muir

See Also

Useful links:

refinr documentation built on Nov. 13, 2023, 1:09 a.m.