re_match: Extract Regular Expression Matches Into a Data Frame
In MangoTheCat/rematch2: Tidy Output from Regular Expression Matching


View source: R/package.R

re_match wraps regexpr and returns the match results in a convenient data frame. The data frame has one column for each capture group if perl=TRUE, and one final column called .match for the matching (sub)string. The capture group columns are named if the groups themselves are named.
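To make the relationship with regexpr concrete, here is a minimal base R sketch of how such a data frame could be assembled from the attributes regexpr returns when perl = TRUE. The helper name sketch_match is hypothetical and this is not the package's actual source. As an assumption of the sketch, unnamed groups get fallback names (group1, group2, ...), and the .text and .match columns follow the Value description.

```r
# Hypothetical helper, for illustration only (not rematch2's implementation).
sketch_match <- function(text, pattern) {
  m <- regexpr(pattern, text, perl = TRUE)
  pos <- as.integer(m)              # start of each match, -1 if none
  len <- attr(m, "match.length")
  matched <- pos != -1

  # Whole-match substring, NA where nothing matched
  whole <- ifelse(matched, substring(text, pos, pos + len - 1), NA_character_)

  # Capture-group positions are only present with perl = TRUE and
  # at least one group in the pattern
  starts <- attr(m, "capture.start")
  lens <- attr(m, "capture.length")
  n_groups <- if (is.null(starts)) 0L else ncol(starts)

  groups <- lapply(seq_len(n_groups), function(j) {
    s <- starts[, j]
    ifelse(matched, substring(text, s, s + lens[, j] - 1), NA_character_)
  })

  # Assumption of this sketch: give unnamed groups a positional name
  nm <- attr(m, "capture.names")[seq_len(n_groups)]
  unnamed <- which(nm == "")
  nm[unnamed] <- paste0("group", unnamed)
  names(groups) <- nm

  cols <- c(groups, list(.text = text, .match = whole))
  data.frame(cols, stringsAsFactors = FALSE)
}

sketch_match(c("2016-04-20", "not a date"),
             "(?<year>[0-9]{4})-(?<month>[0-9]{2})")
```

Elements that do not match produce NA in the group and .match columns, so the result always has one row per input element.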

re_match(text, pattern, perl = TRUE, ...)

`text`	Character vector.
`pattern`	A regular expression. See `regex` for more about regular expressions.
`perl`	Logical. Should Perl-compatible regular expressions be used? Defaults to TRUE; setting it to FALSE disables capture groups.
`...`	Additional arguments to pass to `regexpr`.

A data frame of character vectors: one column per capture group, named if the group was named, and additional columns for the input text and the first matching (sub)string. Each row corresponds to an element in the text vector.

re_match uses PCRE-compatible regular expressions by default (i.e. perl = TRUE in regexpr). You can switch this off, but capture groups will then no longer be reported, because regexpr only returns them when PCRE is used.
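The reason can be seen directly in base R. The following small illustration (the sample text and pattern are made up for this sketch) calls regexpr with both settings: the overall match is found either way, but the capture-group attributes are attached only when perl = TRUE.

```r
text <- c("2016-04-20", "not a date")
pattern <- "([0-9]{4})-([0-9]{2})"

m_perl <- regexpr(pattern, text, perl = TRUE)
m_ere <- regexpr(pattern, text, perl = FALSE)

# Both engines locate the same overall match
regmatches(text, m_perl)
regmatches(text, m_ere)

# Only the PCRE result carries capture-group positions
is.null(attr(m_perl, "capture.start"))  # FALSE
is.null(attr(m_ere, "capture.start"))   # TRUE
```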

Other tidy regular expression matching: re_exec_all, re_exec, re_match_all

library(rematch2)

dates <- c("2016-04-20", "1977-08-08", "not a date", "2016",
  "76-03-02", "2012-06-30", "2015-01-21 19:58")
isodate <- "([0-9]{4})-([0-1][0-9])-([0-3][0-9])"
re_match(text = dates, pattern = isodate)

# The same with named groups
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
re_match(text = dates, pattern = isodaten)