View source: R/NaiveTokenizer.R
Simple tokenizer that splits text into words at punctuation and whitespace. If possible, prefer a deep-learning (DL) tokenizer. WARNING: This tokenizer is built for the English language but can also be applied to other Latin-based or Cyrillic-based languages. It does not work on other writing systems such as Chinese, Devanagari, Thai, Japanese, Hebrew, or Arabic.
naiveTokenizer(string)
Arguments:
string: character string to be tokenized
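To make the splitting rule concrete, here is a minimal, self-contained sketch in base R of naive tokenization by splitting on punctuation and whitespace. It illustrates the general approach only, not the package's actual implementation; in particular, the choice to discard punctuation rather than keep it as tokens is an assumption and may differ from what naiveTokenizer() returns.

# Illustrative sketch only; not the source of naiveTokenizer().
naive_tokenize_sketch <- function(string) {
  # Split on runs of punctuation or whitespace. [[:punct:]] and [[:space:]]
  # are POSIX character classes, which is why this works for Latin- and
  # Cyrillic-based scripts but not for scripts without word delimiters
  # (e.g. Chinese, Thai, Japanese).
  tokens <- unlist(strsplit(string, "[[:punct:][:space:]]+"))
  # Drop empty strings produced by leading delimiters.
  tokens[nzchar(tokens)]
}

naive_tokenize_sketch("Hello, world! This is a test.")
# [1] "Hello" "world" "This"  "is"    "a"     "test"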