uniqtag: Abbreviate Strings to Short, Unique Identifiers

For each string in a set of strings, determine a tag: a substring of fixed size k that is unique to that string, if one exists. If no such unique substring exists, the least frequent substring is used instead. If multiple candidate substrings exist, the lexicographically smallest one is used. The substring of size k selected this way is called the "UniqTag" of that string.
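
The selection rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the package's R implementation. The function names `kmers_of` and `uniqtags` are chosen here for illustration. Two points are assumptions, since the description does not specify them: a substring's frequency is taken to be the number of input strings that contain it, and a string shorter than k is treated as its own only candidate.

```python
from collections import Counter


def kmers_of(s, k):
    """Return the distinct substrings of length k in s.

    Assumption: a string no longer than k is its own single candidate.
    """
    if len(s) <= k:
        return {s}
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def uniqtags(strings, k):
    """Pick one tag per string: least frequent k-mer, ties broken lexicographically."""
    kmer_sets = [kmers_of(s, k) for s in strings]
    # Assumption: frequency = number of input strings containing the k-mer.
    freq = Counter(kmer for kmers in kmer_sets for kmer in kmers)
    # A k-mer with frequency 1 is unique to its string; min() prefers those,
    # then falls back to the least frequent, then to lexicographic order.
    return [min(kmers, key=lambda kmer: (freq[kmer], kmer)) for kmers in kmer_sets]


print(uniqtags(["banana", "bandana", "ban"], 3))
# prints ['nan', 'and', 'ban']
```

In this example "banana" has exactly one unique 3-mer ("nan"), "bandana" has three ("and", "dan", "nda") and takes the lexicographically smallest, and "ban" has no unique 3-mer, so it falls back to its least frequent one.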

Author: Shaun Jackman [cre]
Date of publication: 2015-04-29 01:17:59
Maintainer: Shaun Jackman <sjackman@gmail.com>
License: MIT + file LICENSE
Version: 1.0

Man pages

cumcount
Cumulative count of strings.
kmers_of
Return the k-mers of a string.
make_unique
Make character strings unique.
uniqtag
Abbreviate strings to short, unique identifiers.
uniqtag-package
Abbreviate strings to short, unique identifiers.
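
To give a feel for the two helper entries above, here is a rough Python sketch of what "cumulative count of strings" and "make character strings unique" could look like. It borrows the man-page names `cumcount` and `make_unique` but is not the package's R code. The behavior is inferred from the one-line summaries only. In particular, the numeric suffix scheme and the `sep` parameter are assumptions made for illustration.

```python
from collections import Counter


def cumcount(strings):
    """For each position, count how many times that string has occurred so far."""
    seen = Counter()
    counts = []
    for s in strings:
        seen[s] += 1
        counts.append(seen[s])
    return counts


def make_unique(strings, sep="-"):
    """Append a running number to every string that occurs more than once.

    Assumption: the suffix format (sep + occurrence number) is hypothetical.
    """
    totals = Counter(strings)
    return [
        f"{s}{sep}{n}" if totals[s] > 1 else s
        for s, n in zip(strings, cumcount(strings))
    ]


print(cumcount(["a", "b", "a", "a"]))
# prints [1, 1, 2, 3]
print(make_unique(["a", "b", "a", "a"]))
# prints ['a-1', 'b', 'a-2', 'a-3']
```

A helper of this kind is a natural fallback when two strings end up with the same tag, since numbering the duplicates restores uniqueness.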

Files in this package

uniqtag
uniqtag/tests
uniqtag/tests/testthat.R
uniqtag/tests/testthat
uniqtag/tests/testthat/test-kmers-of.R
uniqtag/tests/testthat/test-make-unique.R
uniqtag/tests/testthat/test-uniqtag.R
uniqtag/NAMESPACE
uniqtag/R
uniqtag/R/uniqtag.R
uniqtag/README.md
uniqtag/MD5
uniqtag/DESCRIPTION
uniqtag/man
uniqtag/man/kmers_of.Rd
uniqtag/man/cumcount.Rd
uniqtag/man/uniqtag-package.Rd
uniqtag/man/uniqtag.Rd
uniqtag/man/make_unique.Rd
uniqtag/LICENSE