When n=1, simply tokenizes the text and emits words with counts. When n>1, tokenized words are combined into permutations of length n within each document.
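The behaviour described above can be sketched in a few lines. This is an illustration only, not the package's implementation: it assumes that "permutations of length n" means runs of n consecutive tokens (conventional n-grams), that tokens are split on whitespace, and that the separator joins the words of one n-gram. The names `ngram_counts` and `token_sep` are hypothetical.

```python
from collections import Counter


def ngram_counts(documents, n=1, token_sep=" "):
    """Return one Counter of n-grams per document.

    Assumption: an n-gram is a run of n consecutive tokens, and
    n-grams never span two documents.
    """
    counts = []
    for text in documents:
        tokens = text.split()  # whitespace tokenization (an assumption)
        # Slide a window of width n over the token list.
        grams = [
            token_sep.join(tokens[i:i + n])
            for i in range(len(tokens) - n + 1)
        ]
        counts.append(Counter(grams))
    return counts


docs = ["the cat sat on the mat", "the dog sat"]
unigrams = ngram_counts(docs, n=1)  # words with counts
bigrams = ngram_counts(docs, n=2)   # two-word sequences, per document
```

With n=1 the result is a plain word count for each document; with n=2 each document yields its own two-word sequences, and a document shorter than n yields none.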
n | number of words
tokenSep | a character string to separate the tokens when
ignoreCase | logical: treat text as-is (
delimiter | character or string that divides one word from the next. You can use a regular expression as the
punctuation | a regular expression that specifies the punctuation characters the parser removes before it evaluates the input text.
stemming | logical: if TRUE, apply Porter2 stemming to each token to reduce it to its root form. Default is
stopWords | logical, or a string with the name of the file that contains stop words. If TRUE, stop words are ignored when parsing text. Each stop word is specified on a separate line of the file.
sep | a character string to separate multiple text columns.
minLength | exclude tokens shorter than minLength characters.
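To show how several of these options could interact, here is a minimal sketch of a tokenization step. It is not the package's parser. The order of operations, the default values, and the reading of `ignoreCase` as "lowercase the text when true" are all assumptions, and the function name `tokenize` and its snake_case parameters are hypothetical stand-ins for the arguments above. Stemming is left out because it would need an external Porter2 implementation.

```python
import re


def tokenize(text, ignore_case=True, delimiter=r"\s+",
             punctuation=r"[.,!?;:]", stop_words=frozenset(), min_length=1):
    """Split text into tokens, applying the filters in an assumed order."""
    if ignore_case:
        text = text.lower()  # assumed meaning of ignoreCase
    # Remove punctuation before the text is split into words.
    text = re.sub(punctuation, "", text)
    # The delimiter is treated as a regular expression; drop empty pieces.
    tokens = [t for t in re.split(delimiter, text) if t]
    # Drop stop words and tokens shorter than min_length characters.
    return [t for t in tokens if len(t) >= min_length and t not in stop_words]


sample = "The cat, the hat; a mat!"
filtered = tokenize(sample, stop_words={"the", "a"}, min_length=3)
as_is = tokenize(sample, ignore_case=False, stop_words={"the", "a"}, min_length=3)
```

In this sketch the stop-word comparison happens after case folding, so "The" is dropped only when `ignore_case` is true. A real parser may order these steps differently.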
pluggable token parser