Replaces tokens by multiple replacements consisting of elements split by a
separator pattern, with the option of retaining the separator. This function
effectively reverses the operation of tokens_compound().
tokens_split(
  x,
  separator = " ",
  valuetype = c("fixed", "regex"),
  remove_separator = TRUE
)
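x | a tokens object
separator | a single-character pattern match by which tokens are separated
valuetype | the type of pattern matching: "fixed" for exact matching or "regex" for regular expressions
remove_separator | if TRUE, remove the separator from the new tokens; if FALSE, retain it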
# undo tokens_compound()
toks1 <- tokens("pork barrel is an idiomatic multi-word expression")
tokens_compound(toks1, phrase("pork barrel"))
tokens_compound(toks1, phrase("pork barrel")) %>%
  tokens_split(separator = "_")

# similar to tokens(x, remove_hyphen = TRUE) but post-tokenization
toks2 <- tokens("UK-EU negotiation is not going anywhere as of 2018-12-24.")
tokens_split(toks2, separator = "-", remove_separator = FALSE)
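The effect of remove_separator can be sketched in base R for a single token. This is only an illustration of the idea, not how tokens_split() is implemented: split_token() is a hypothetical helper that is not part of quanteda, it handles fixed (non-regex) separators only, and it assumes a retained separator becomes a token of its own.

```r
# Hypothetical helper (not part of quanteda): split one token on a fixed
# separator, optionally keeping the separator as a token of its own.
split_token <- function(token, separator, remove_separator = TRUE) {
  parts <- strsplit(token, separator, fixed = TRUE)[[1]]
  # Nothing to interleave if the separator is dropped or was not found
  if (remove_separator || length(parts) < 2) {
    return(parts)
  }
  # Interleave the separator between the parts: part, sep, part, sep, part
  out <- character(2 * length(parts) - 1)
  out[seq(1, length(out), by = 2)] <- parts
  out[seq(2, length(out), by = 2)] <- separator
  out
}

# Apply the helper to each token of a small character vector
toks <- c("UK-EU", "negotiation", "2018-12-24")
unlist(lapply(toks, split_token, separator = "-"))
unlist(lapply(toks, split_token, separator = "-", remove_separator = FALSE))
```

With remove_separator = TRUE the sketch yields only the pieces between separators; with FALSE each "-" is kept between the pieces it separated, so the original token can still be reassembled.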