.mp_tokenize_word_bidir    R Documentation
Description:

     Apply .mp_tokenize_word from both directions and pick the result
     with fewer pieces.

Usage:

     .mp_tokenize_word_bidir(
       word,
       vocab_split,
       unk_token,
       max_chars,
       allow_compounds = TRUE
     )
Arguments:

    word: Character scalar; the word to tokenize.

    vocab_split: List of character vectors containing vocabulary words.
          Should have components named "prefixes", "words", and
          "suffixes".

    unk_token: Character scalar; token used to represent unknown words.

    max_chars: Integer; maximum length of word recognized.

    allow_compounds: Logical; whether to allow multiple whole words in
          the breakdown. Defaults to TRUE. This option is not exposed
          to end users; it is kept here for documentation and
          development purposes.
Value:

     The input word, broken down into a list of tokens.
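The underlying .mp_tokenize_word is internal and not documented here, so the following is only an illustrative sketch of the bidirectional strategy: a toy greedy longest-match tokenizer (an assumption, not the package's actual algorithm) is run from the front and from the back of the word, and the breakdown with fewer pieces wins.

```r
# Sketch only: greedy_tokenize is a hypothetical stand-in for the internal
# .mp_tokenize_word; the real function also handles prefixes/suffixes and
# max_chars, which are omitted here for brevity.
greedy_tokenize <- function(word, pieces, from_start = TRUE) {
  tokens <- character(0)
  remaining <- word
  while (nchar(remaining) > 0) {
    found <- FALSE
    # Try the longest possible match first, shrinking one character at a time.
    for (len in seq(nchar(remaining), 1)) {
      piece <- if (from_start) {
        substr(remaining, 1, len)
      } else {
        substr(remaining, nchar(remaining) - len + 1, nchar(remaining))
      }
      if (piece %in% pieces) {
        tokens <- if (from_start) c(tokens, piece) else c(piece, tokens)
        remaining <- if (from_start) {
          substr(remaining, len + 1, nchar(remaining))
        } else {
          substr(remaining, 1, nchar(remaining) - len)
        }
        found <- TRUE
        break
      }
    }
    if (!found) return("[UNK]")  # no piece matches; give up on this word
  }
  tokens
}

tokenize_bidir <- function(word, pieces) {
  fwd <- greedy_tokenize(word, pieces, from_start = TRUE)
  bwd <- greedy_tokenize(word, pieces, from_start = FALSE)
  # Pick the breakdown with fewer pieces; ties go to the forward pass.
  if (length(bwd) < length(fwd)) bwd else fwd
}

pieces <- c("un", "lock", "able", "unlock", "lockable")
tokenize_bidir("unlockable", pieces)  # both directions find 2 pieces;
                                      # forward wins: "unlock" "able"
```

Note that the two directions can disagree: forward greedy matching finds "unlock" + "able", while backward matching finds "un" + "lockable". Comparing piece counts (with a tie-break) is what lets the bidirectional version recover from a bad greedy split in one direction.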