create_fingerprint | R Documentation |
This function creates a fingerprint of a string. The fingerprint can be used for de-duplication or for calculating string similarity or string distance. It is based on normalised tokens and implements OpenRefine's clustering algorithm, specifically the Fingerprint Key Collision method. See https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth
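To illustrate the idea behind key collision, here is a minimal base R sketch of a word-based fingerprint. It assumes the usual steps described in the OpenRefine documentation (trim, lower-case, strip punctuation, split on whitespace, sort and de-duplicate the tokens, join them again). The helper name `fingerprint_sketch` is hypothetical and is not part of this package; the actual implementation of `create_fingerprint` may differ, for example in how it handles accented characters.

```r
# Hypothetical helper for illustration only; not the package's implementation.
fingerprint_sketch <- function(string) {
  # Normalise: trim surrounding whitespace and lower-case
  x <- tolower(trimws(string))
  # Drop punctuation and control characters
  x <- gsub("[[:punct:][:cntrl:]]", "", x)
  # Tokenise on runs of whitespace and discard empty tokens
  tokens <- strsplit(x, "[[:space:]]+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  # Sort and de-duplicate, then join into a single key
  # (method = "radix" sorts in the C locale, independent of the session locale)
  paste(sort(unique(tokens), method = "radix"), collapse = " ")
}

# Two spellings of the same name collide on the same key
fingerprint_sketch("Max Spohr Verlag")
fingerprint_sketch("Verlag, Max Spohr")
```

Strings whose keys are identical can then be grouped as likely duplicates.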
create_fingerprint(string, tokens = "word", n = NULL)
string | input string |
tokens | how to generate tokens? |
n | The number of characters in each shingle. If |
character string
create_fingerprint("Max Spohr Verlag", tokens = "word")
create_fingerprint("Max Spohr Verlag", tokens = "ngram", n = 2)
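For the n-gram variant, the following base R sketch shows one way character shingles of length `n` could be turned into a key. It assumes the n-gram fingerprint steps described in the OpenRefine documentation (lower-case, remove punctuation and whitespace, collect all n-grams, sort and de-duplicate, concatenate). The helper `ngram_fingerprint_sketch` is hypothetical; the output of `create_fingerprint` with `tokens = "ngram"` is not guaranteed to match it.

```r
# Hypothetical helper for illustration only; not the package's implementation.
ngram_fingerprint_sketch <- function(string, n = 2) {
  # Normalise: lower-case and drop punctuation, control characters and whitespace
  x <- tolower(string)
  x <- gsub("[[:punct:][:cntrl:][:space:]]", "", x)
  # Strings shorter than one shingle are returned as they are
  if (nchar(x) < n) return(x)
  # Extract every overlapping shingle of n characters
  starts <- seq_len(nchar(x) - n + 1)
  shingles <- substring(x, starts, starts + n - 1)
  # Sort and de-duplicate, then concatenate without a separator
  paste(sort(unique(shingles), method = "radix"), collapse = "")
}

ngram_fingerprint_sketch("Max Spohr Verlag", n = 2)
```

Smaller values of `n` make the key more tolerant of small spelling differences, at the cost of more false matches.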