| string_split2df | R Documentation |
Splits a character vector and formats the resulting substrings into a data.frame
string_split2df(
x,
data = NULL,
split = NULL,
id = NULL,
add.pos = FALSE,
id_unik = TRUE,
fixed = FALSE,
ignore.case = FALSE,
word = FALSE,
envir = parent.frame(),
dt = FALSE,
...
)
string_split2dt(
x,
data = NULL,
split = NULL,
id = NULL,
add.pos = FALSE,
id_unik = TRUE,
fixed = FALSE
)
x |
A character vector or a two-sided formula. If a two-sided formula, then the
argument |
data |
Optional, only used if the argument |
split |
A character scalar. Used to split the character vectors. By default
this is a regular expression. You can use flags in the pattern in the form |
id |
Optional. A character vector or a list of vectors. If provided, the
values of |
add.pos |
Logical, default is |
id_unik |
Logical, default is |
fixed |
Logical, default is |
ignore.case |
Logical scalar, default is |
word |
Logical scalar, default is |
envir |
Environment in which to evaluate the interpolations if the flag |
dt |
Logical, default is |
... |
Not currently used. |
Returns a data.frame or a data.table with the following columns: i) obs: the observation index,
ii) pos: the position of the text element in the initial string (optional, only if add.pos = TRUE),
iii) the text element, iv) the identifier(s) (optional, only if id was provided).
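To make the layout described above concrete, here is a minimal base-R sketch that builds a comparable long-format table with strsplit(). It is not the package's implementation: the helper name split_to_long and the column name text are hypothetical, and it assumes that pos is the rank of each piece within its original string.

```r
# Illustrative helper (hypothetical name, not part of the package):
# splits each string with base R's strsplit() and stacks the pieces
# in long format, one row per piece.
split_to_long = function(x, split, add.pos = FALSE) {
  pieces = strsplit(x, split)
  n = lengths(pieces)  # number of pieces per original string

  # obs: index of the original string each piece comes from
  res = data.frame(obs = rep(seq_along(x), n), stringsAsFactors = FALSE)

  # pos: rank of the piece within its original string (assumption)
  if (add.pos) {
    res$pos = sequence(n)
  }

  # the text element itself (column name chosen for illustration)
  res$text = as.character(unlist(pieces))
  res
}

x = c("a,b,c", "d,e")
out = split_to_long(x, ",", add.pos = TRUE)
print(out)
```

Each row corresponds to one piece of one input string, so obs repeats as many times as that string has pieces.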
string_split2dt(): Splits a string vector and returns a data.table
String operations: string_is(), string_get(), string_clean(), string_split2df().
Chain basic operations with string_ops(). Clean character vectors efficiently
with string_clean().
Use string_vec() to create simple string vectors.
String interpolation combined with operation chaining: string_magic(). You can change string_magic
default values with string_magic_alias() and add custom operations with string_magic_register_fun().
Display messages while benefiting from string_magic interpolation with cat_magic() and message_magic().
Other tools with aliases:
cat_magic_alias(),
string_magic(),
string_magic_alias(),
string_ops_alias(),
string_vec_alias()
x = c("Nor rain, wind, thunder, fire are my daughters.",
"When my information changes, I alter my conclusions.")
id = c("ws", "jmk")
# we split at each word
string_split2df(x, "[[:punct:] ]+")
# we add the 'id'
string_split2df(x, "[[:punct:] ]+", id = id)
# NOTE:
# - the second argument is `data`
# - when `data` is missing, `split` implicitly becomes the second argument
# - e.g., above we did not need to write `split = "[[:punct:] ]+"`
#
# using the formula
base = data.frame(text = x, my_id = id)
string_split2df(text ~ my_id, base, "[[:punct:] ]+")
#
# with 2+ identifiers
base = within(mtcars, carname <- rownames(mtcars))
# a message is displayed because the identifiers are not unique
string_split2df(carname ~ am + gear + carb, base, " +")
# adding the position of the words & removing the message
string_split2df(carname ~ am + gear + carb, base, " +", id_unik = FALSE, add.pos = TRUE)