tokens_split | R Documentation |
Replaces each token with the elements obtained by splitting it at a
separator pattern, with the option of retaining the separator. This function
effectively reverses the operation of tokens_compound().
tokens_split(
  x,
  separator = " ",
  valuetype = c("fixed", "regex"),
  remove_separator = TRUE,
  apply_if = NULL
)
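x | a tokens object |
separator | a single-character pattern match by which tokens are separated |
valuetype | the type of pattern matching: "fixed" for exact matching or "regex" for regular expressions |
remove_separator | if TRUE, remove the separator from the new tokens; if FALSE, retain it |
apply_if | logical vector of length ndoc(x); documents are modified only when the corresponding values are TRUE, others are left unchanged |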
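The following is a minimal sketch of how the valuetype and apply_if arguments might be used. It assumes the quanteda package is installed in a version whose tokens_split() signature matches the usage above. The object names (toks, res_regex, res_cond) and the toy tokens are illustrative only. The tokens object is built with as.tokens() from a list so that the example does not depend on how tokens() treats hyphens and slashes.

```r
library("quanteda")

# Build a small tokens object directly from a named list (illustrative data)
toks <- as.tokens(list(
  d1 = c("a-b", "c/d"),
  d2 = c("e-f", "g/h")
))

# valuetype = "regex": split on either a hyphen or a slash
res_regex <- tokens_split(toks, separator = "[-/]", valuetype = "regex")
as.list(res_regex)

# apply_if: split on the hyphen only in the first document;
# the second document is expected to be left unchanged
res_cond <- tokens_split(toks, separator = "-", apply_if = c(TRUE, FALSE))
as.list(res_cond)
```

With the default valuetype = "fixed", the separator is matched literally, so a pattern such as "[-/]" would only match that exact five-character string.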
library("quanteda")

# undo tokens_compound()
toks1 <- tokens("pork barrel is an idiomatic multi-word expression")
tokens_compound(toks1, phrase("pork barrel"))
tokens_compound(toks1, phrase("pork barrel")) |>
  tokens_split(separator = "_")
# similar to tokens(x, split_hyphens = TRUE) but post-tokenization
toks2 <- tokens("UK-EU negotiation is not going anywhere as of 2018-12-24.")
tokens_split(toks2, separator = "-", remove_separator = FALSE)