Description Usage Arguments Value Tidy Data Extracting Match Data See Also Examples
Match a regular expression to a string, and return matches, match positions,
and capture groups. This function is like its match counterpart, except it
returns match/capture group start and end positions in addition to the
matched values.
text | Character vector.
pattern | A regular expression. See
perl | Logical. Should Perl-compatible regular expressions be used? Defaults to TRUE; setting it to FALSE disables capture groups.
... | Additional arguments to pass to
x | Object returned by
name |
A tidy data frame (see Section “Tidy Data”). Match record entries are length-one vectors that are set to NA if there is no match.
The return value is a tidy data frame where each row corresponds to an
element of the input character vector text. The values from text appear for
reference in the .text character column. All other columns are list columns
containing the match data. The .match column contains the match information
for full regular expression matches, while the other columns correspond to
capture groups, if there are any and PCRE matches are enabled with
perl = TRUE (this is on by default). If capture groups are named, the
corresponding columns will bear those names.
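The column layout described above can be inspected directly. The following is a minimal sketch, assuming that re_exec() is the function exported by the rematch2 package (the page itself does not name the package) and that the package is installed; the input strings and the capture group name word are made up for illustration.

```r
# Assumption: re_exec() comes from the rematch2 package.
# The guard keeps the sketch harmless when the package is not installed.
if (requireNamespace("rematch2", quietly = TRUE)) {
  res <- rematch2::re_exec(c("foo bar", "12345"), "(?<word>[a-z]+)")

  # One row per element of the input vector
  print(nrow(res))

  # Capture group columns plus the .text and .match columns
  print(names(res))

  # .text is a plain character column; match data live in list columns
  print(is.character(res$.text))
  print(is.list(res$.match))
  print(is.list(res$word))
}
```

The second input has no lowercase letters, so its match record entries should be NA, as described in the return value.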
Each match data column list contains match records, one for each element in
text. A match record is a named list, with entries match, start and end that
are respectively the matching (sub)string, the start, and the end positions
(using one-based indexing).
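To make the shape of a match record concrete, here is a sketch in base R that builds the same three entries by hand from regexpr() with perl = TRUE. It illustrates the one-based start and end positions only; it is not the package's actual implementation, and the variable names (full, first) are made up for this example.

```r
text <- "\tMillard Fillmore"
pattern <- "(?<first>[[:upper:]][[:lower:]]+) (?<last>[[:upper:]][[:lower:]]+)"

# regexpr() returns the one-based start of the first match, with the match
# length and the capture group data attached as attributes
m <- regexpr(pattern, text, perl = TRUE)

# Record for the full match: the end is derived from start and length
full_start <- as.integer(m)
full_end <- full_start + attr(m, "match.length") - 1L
full <- list(
  match = substring(text, full_start, full_end),
  start = full_start,
  end = full_end
)

# Record for the capture group named "first"
cap_start <- unname(attr(m, "capture.start")[1, "first"])
cap_end <- cap_start + unname(attr(m, "capture.length")[1, "first"]) - 1L
first <- list(
  match = substring(text, cap_start, cap_end),
  start = cap_start,
  end = cap_end
)

str(full)   # match "Millard Fillmore", start 2, end 17
str(first)  # match "Millard", start 2, end 8
```

The leading tab occupies position 1, which is why both records start at position 2.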
To make it easier to extract matching substrings or positions, a special $
operator is defined on match columns, both for the .match column and the
columns corresponding to the capture groups. See examples below.
regexpr, which this function wraps.
Other tidy regular expression matching: re_exec_all(), re_match_all(), re_match()
name_rex <- paste0(
"(?<first>[[:upper:]][[:lower:]]+) ",
"(?<last>[[:upper:]][[:lower:]]+)"
)
notables <- c(
" Ben Franklin and Jefferson Davis",
"\tMillard Fillmore"
)
# Match first occurrence
pos <- re_exec(notables, name_rex)
pos
# Custom $ to extract matches and positions
pos$first$match
pos$first$start
pos$first$end