regexprCapture | R Documentation |
Applies a (Perl) regular expression with capture groups to text strings and
returns a matrix. Each matrix column holds the text matched by one capture
group (in order), and each matrix row is the outcome of applying the regexp to
one element of the text data. If a capture group does not match, the empty
string is returned unless use.na = TRUE is set, in which case NA is returned.
In either case, if a capture group matches but matches nothing (e.g. when * is
used to match 0 or more characters and 0 are matched), an empty string is returned.
regexprCapture(re, data, use.na = FALSE)
re |
The (perl) regular expression as a string, with capture groups. May
use named capture groups ( |
data |
A vector of strings to search in. The rows in the returned matrix will be the captured text from successive elements of this vector. |
use.na |
Set to TRUE to return NA as the matched text for capture groups that fail to match. Defaults to FALSE. |
This is implemented using regexprMatches.
A matrix with one column per regular expression capture group and one row per data element. Columns will be named if named capture groups are used.
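As a rough illustration of the behaviour described above, the following base R sketch builds the same kind of matrix from the capture attributes that regexpr() returns when perl = TRUE. The function name captureSketch is hypothetical, and this is not the package's own implementation; it only treats a failure of the whole pattern as "no match", which is an assumption about how use.na is meant to behave.

```r
# Illustrative sketch only: captureSketch is a hypothetical name, not part of
# the documented package, and not how regexprCapture itself is implemented.
captureSketch <- function(re, data, use.na = FALSE) {
  # With perl = TRUE, regexpr() records per-group positions as attributes.
  m <- regexpr(re, data, perl = TRUE)
  starts <- attr(m, "capture.start")
  if (is.null(starts)) stop("'re' must contain at least one capture group")
  ends <- starts + attr(m, "capture.length") - 1

  # substring() recycles 'data' down each column of the start/end matrices.
  # A failed match has start -1 and length -1, which yields "".
  out <- matrix(substring(data, starts, ends),
                nrow = length(data),
                dimnames = list(NULL, attr(m, "capture.names")))

  # Assumption: only rows where the whole pattern failed become NA.
  if (use.na) out[which(as.vector(m) == -1), ] <- NA_character_
  out
}

# Example call: one row per input string, one column per named group.
captureSketch("(?<key>\\w+)=(?<val>\\d*)", c("a=1", "b=", "none"))
```

Extracting every group with a single vectorised substring() call keeps the sketch short; a group that matches zero characters has length 0 and therefore comes back as an empty string in both modes.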
# Capture group: (...)
# Named capture group: (?<name>...)
# Lazy quantifier: *?
regExp <- "\\s*(?<name>.*?)\\s*<\\s*(?<email>.+)\\s*>\\s*"
data <- c('Stuart R. Jefferys <srj@unc.edu>',
'nonya business <nobody@nowhere.com>',
'no email', '<just@an.email>' )
regexprCapture(regExp, data)
#=>      name                 email
#=> [1,] "Stuart R. Jefferys" "srj@unc.edu"
#=> [2,] "nonya business"     "nobody@nowhere.com"
#=> [3,] ""                   ""
#=> [4,] ""                   "just@an.email"
regexprCapture(regExp, data, use.na=TRUE)
#=>      name                 email
#=> [1,] "Stuart R. Jefferys" "srj@unc.edu"
#=> [2,] "nonya business"     "nobody@nowhere.com"
#=> [3,] NA                   NA
#=> [4,] ""                   "just@an.email"