string_split | R Documentation |
Splits a character string with respect to a pattern.
string_split(
x,
split,
simplify = TRUE,
fixed = FALSE,
ignore.case = FALSE,
word = FALSE,
envir = parent.frame()
)
stsplit(
x,
split,
simplify = TRUE,
fixed = FALSE,
ignore.case = FALSE,
word = FALSE,
envir = parent.frame()
)
x |
A character vector. |
split |
A character scalar used to split the character vector. By default this is a regular expression. You can use flags in the pattern in the form "flag1, flag2/pattern". |
simplify |
Logical scalar, default is TRUE. |
fixed |
Logical, default is FALSE. |
ignore.case |
Logical scalar, default is FALSE. |
word |
Logical scalar, default is FALSE. |
envir |
Environment in which to evaluate the interpolations if the flag "magic" is used. Default is parent.frame(). |
If simplify = TRUE (default), the object returned is:
a character vector if x, the input vector, is of length 1: the character vector contains the result of the split.
a list of the same length as x otherwise. The ith element of the list is a character vector containing the result of the split of the ith element of x.
If simplify = FALSE, the object returned is always a list.
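The return types described above can be checked with a short sketch. It assumes the stringmagic package is installed; the input strings are made up for illustration and no output is shown.

```r
library(stringmagic)

# scalar input with the default simplify = TRUE: a character vector
one <- string_split("one two three", " ")
is.character(one)

# input of length 2: a list with one character vector per element of x
two <- string_split(c("one two three", "four five"), " ")
is.list(two)
length(two)

# simplify = FALSE: always a list, even for a scalar input
three <- string_split("one two three", " ", simplify = FALSE)
is.list(three)
```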
stsplit(): Alias to string_split
All stringmagic functions support generic flags in regular-expression patterns.
The flags are useful to quickly give extra instructions, similarly to usual regular expression flags.
Here the syntax is "flag1, flag2/pattern". That is: flags are a comma separated list of flag-names separated from the pattern with a slash (/). Example: string_which(c("hello...", "world"), "fixed/.") returns 1.
Here the flag "fixed" removes the regular expression meaning of "." which would have otherwise meant "any character".
The no-flag version string_which(c("hello...", "world"), ".") returns 1:2.
Alternatively, and this is recommended, you can collate the initials of the flags instead of using a comma separated list. For example: "if/dt[" will apply the flags "ignore" and "fixed" to the pattern "dt[".
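For readers more familiar with base R, the effect of the "fixed" flag in the string_which example can be reproduced with grep() and its fixed argument. This is only a rough analogy in base R, not a description of how stringmagic implements the flag.

```r
x <- c("hello...", "world")

# "." as a regular expression matches any character: both elements match
with_regex <- grep(".", x)
with_regex
# [1] 1 2

# fixed = TRUE treats "." as a literal dot: only the first element matches
with_fixed <- grep(".", x, fixed = TRUE)
with_fixed
# [1] 1
```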
The four flags always available are: "ignore", "fixed", "word" and "magic".
"ignore" instructs to ignore the case. Technically, it adds the perl-flag "(?i)" at the beginning of the pattern.
"fixed" removes the regular expression interpretation, so that the characters ".", "$", "^", "[" (among others) lose their special meaning and are treated for what they are: simple characters.
"word" adds word boundaries ("\\b" in regex language) to the pattern. Further, the comma (",") becomes a word separator. Technically, "word/one, two" is treated as "\b(one|two)\b". Example: string_clean("Am I ambushed?", "wi/am") leads to " I ambushed?" thanks to the flags "ignore" and "word".
"magic" allows variables to be interpolated inside the pattern before regex interpretation. For example if letters = "aiou" then string_clean("My great goose!", "magic/[{letters}] => e") leads to "My greet geese!".
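The two string_clean examples above can be mimicked in base R to make the effect of the flags concrete. This is a sketch of the equivalent regular expressions as described in the text, not the package's internal code; `vowels` is a hypothetical variable name standing in for `letters`.

```r
# "wi/am": per the description, "word" wraps the pattern in \b(...)\b
# and "ignore" prepends the perl flag (?i)
res_word <- gsub("(?i)\\b(am)\\b", "", "Am I ambushed?", perl = TRUE)
res_word
# [1] " I ambushed?"

# "magic/[{letters}] => e": the variable is inserted into the pattern
# before the regular expression is interpreted
vowels <- "aiou"  # hypothetical name, plays the role of `letters`
pattern <- paste0("[", vowels, "]")
res_magic <- gsub(pattern, "e", "My great goose!")
res_magic
# [1] "My greet geese!"
```

In the first call only the standalone "Am" is removed, because the "am" in "ambushed" is not followed by a word boundary.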
time = "This is the year 2024."
# we break the sentence
string_split(time, " ")
# simplify = FALSE leads to a list
string_split(time, " ", simplify = FALSE)
# let's break at "is"
string_split(time, "is")
# now breaking at the word "is"
# NOTE: we use the flag `word` (`w/`)
string_split(time, "w/is")
# same but using a pattern from a variable
# NOTE: we use the `magic` flag
pat = "is"
string_split(time, "mw/{pat}")