Convert natural language text into tokens. The tokenizers have a consistent interface and are compatible with Unicode, thanks to being built on the 'stringi' package. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, lines, and regular expressions.
| Author | Lincoln Mullen [aut, cre], Dmitriy Selivanov [ctb] |
| Date of publication | 2016-08-29 22:59:29 |
| Maintainer | Lincoln Mullen <email@example.com> |
| License | MIT + file LICENSE |
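A minimal sketch of the consistent interface described above, assuming the exported `tokenize_words()` and `tokenize_ngrams()` functions from the tokenizers package (argument names `n` and `n_min` are as commonly documented; verify against your installed version):

```r
library(tokenizers)

text <- "The quick brown fox jumps over the lazy dog."

# Word tokens: by default lowercased, with punctuation stripped
tokenize_words(text)

# Shingled n-grams covering lengths 2 through 3
tokenize_ngrams(text, n = 3, n_min = 2)
```

Each tokenizer accepts a character vector and returns a list of token vectors, one list element per input document, so the same calling pattern works across words, sentences, n-grams, and the other tokenizer types.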