sjackman/uniqtag: Abbreviate Strings to Short, Unique Identifiers

For each string in a set of strings, determine a tag: a substring of fixed size k that occurs in that string and in no other string of the set, if it has one. If no such unique substring exists, the least frequent substring of size k is used. If multiple unique substrings exist, the lexicographically smallest of them is used. The substring of size k chosen this way is called the "UniqTag" of that string.
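The selection rule above can be sketched in base R. This is a minimal illustration and not the package's own implementation: the names `kmers_sketch` and `uniqtag_sketch` are hypothetical. It also makes three assumptions: a substring's frequency is the number of input strings containing it, ties among least frequent substrings are broken lexicographically, and a string shorter than k is used whole as its own tag.

```r
# Hypothetical helper: the distinct substrings of length k in one string.
# Assumption: a string no longer than k is its own only candidate.
kmers_sketch <- function(x, k) {
  n <- nchar(x)
  if (n <= k) return(x)
  unique(substring(x, 1:(n - k + 1), k:n))
}

# Hypothetical helper: one tag per input string (non-empty strings assumed).
uniqtag_sketch <- function(xs, k) {
  kmers <- lapply(xs, kmers_sketch, k = k)
  # Each string lists a substring at most once, so this counts
  # the number of input strings that contain each substring.
  counts <- table(unlist(kmers))
  vapply(kmers, function(ks) {
    n <- as.integer(counts[ks])
    # A count of 1 means the substring is unique to this string.
    # Keep the least frequent candidates, then take the smallest one.
    # Note: min() on strings follows the current locale's collation.
    min(ks[n == min(n)])
  }, character(1))
}

tags <- uniqtag_sketch(c("abcd", "abce", "bcdx"), k = 3)
print(tags)
```

In this example "abce" and "bcdx" each have a substring of size 3 found in no other string ("bce" and "cdx"). "abcd" has none, so the sketch falls back to its least frequent substrings, "abc" and "bcd", and picks the lexicographically smaller one.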

Package details

Maintainer
License: MIT + file LICENSE
Version: 1.0.0
URL: https://github.com/sjackman/uniqtag

Installation

Install the latest version of this package by entering the following in R:
install.packages("devtools")
library(devtools)
install_github("sjackman/uniqtag")
sjackman/uniqtag documentation built on April 6, 2018, 6:53 a.m.