.mp_tokenize_single_string | R Documentation |
Tokenize an Input Word-by-word
.mp_tokenize_single_string(words, vocab, lookup, unk_token, max_chars)
words |
Character; a vector of words (generated by space-tokenizing a single input). |
vocab |
A morphemepiece vocabulary. |
lookup |
A morphemepiece lookup table. |
unk_token |
Token to represent unknown words. |
max_chars |
Maximum length, in characters, of a word that will be recognized. |
A named integer vector of tokenized words.
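The argument and return shapes described above can be illustrated with a small base-R sketch. This is not the package's implementation: `toy_tokenize_words`, `toy_vocab`, and `toy_lookup` are hypothetical names, the vocabulary is assumed to be a plain character vector, the lookup is assumed to be a named list mapping a word to its pieces, and the piece spellings are placeholders. Any fallback splitting the real function may perform for words missing from the lookup is replaced here by simply emitting `unk_token`.

```r
# Toy stand-ins (assumptions): the real vocabulary and lookup objects are
# package-specific and may be structured differently.
toy_vocab <- c("[UNK]", "un", "break", "able", "cat", "s")
toy_lookup <- list(
  unbreakable = c("un", "break", "able"),
  cats = c("cat", "s")
)

# Hypothetical helper: tokenize a vector of words one word at a time and
# return a named integer vector (names are tokens, values are vocab positions).
toy_tokenize_words <- function(words, vocab, lookup,
                               unk_token = "[UNK]", max_chars = 100L) {
  pieces <- lapply(words, function(word) {
    # Words longer than max_chars are not recognized.
    if (nchar(word) > max_chars) return(unk_token)
    # A word that is itself a vocabulary entry is kept whole.
    if (word %in% vocab) return(word)
    # Otherwise use the lookup table's breakdown, if there is one.
    if (!is.null(lookup[[word]])) return(lookup[[word]])
    # Simplification: no fallback algorithm, just the unknown token.
    unk_token
  })
  tokens <- unlist(pieces, use.names = FALSE)
  # Positions are 1-based indices into the toy vocabulary; the real package
  # may number tokens differently.
  ids <- match(tokens, vocab)
  stats::setNames(ids, tokens)
}

result <- toy_tokenize_words(c("cats", "unbreakable", "zzz"),
                             toy_vocab, toy_lookup)
print(result)
```

Each input word contributes one or more entries to the result, so the output is generally longer than `words`. Keeping the tokens as names lets a caller recover the pieces with `names(result)` while still working with integer ids.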