Extract Brackets
In qdapRegex: Regular Expression Removal, Extraction, and Replacement Tools

rm_bracket

R Documentation

Remove/Replace/Extract Brackets

Description

Remove/replace/extract bracketed strings.

Usage

rm_bracket(
  text.var,
  pattern = "all",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = FALSE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

rm_round(
  text.var,
  pattern = "(",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = FALSE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

rm_square(
  text.var,
  pattern = "[",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = FALSE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

rm_curly(
  text.var,
  pattern = "{",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = FALSE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

rm_angle(
  text.var,
  pattern = "<",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = FALSE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

rm_bracket_multiple(
  text.var,
  trim = TRUE,
  clean = TRUE,
  pattern = "all",
  replacement = "",
  extract = FALSE,
  include.markers = FALSE,
  merge = TRUE
)

ex_bracket(
  text.var,
  pattern = "all",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = TRUE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

ex_bracket_multiple(
  text.var,
  trim = TRUE,
  clean = TRUE,
  pattern = "all",
  replacement = "",
  extract = TRUE,
  include.markers = FALSE,
  merge = TRUE
)

ex_angle(
  text.var,
  pattern = "<",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = TRUE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

ex_round(
  text.var,
  pattern = "(",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = TRUE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

ex_square(
  text.var,
  pattern = "[",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = TRUE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

ex_curly(
  text.var,
  pattern = "{",
  trim = TRUE,
  clean = TRUE,
  replacement = "",
  extract = TRUE,
  include.markers = ifelse(extract, FALSE, TRUE),
  dictionary = getOption("regex.library"),
  ...
)

Arguments

`text.var`	The text variable.
`pattern`	The type of bracket (and encased text) to remove. This is one or more of the strings `"curly"`/`"{"`, `"square"`/`"["`, `"round"`/`"("`, `"angle"`/`"<"` and `"all"`. These strings correspond to: {, [, (, < or all four types.
`trim`	logical. If `TRUE` removes leading and trailing white spaces.
`clean`	trim logical. If `TRUE` extra white spaces and escaped character will be removed.
`replacement`	Replacement for matched `pattern`.
`extract`	logical. If `TRUE` the bracketed text is extracted into a list of vectors.
`include.markers`	logical. If `TRUE` and `extract = TRUE` returns the markers (left/right) and the text between.
`dictionary`	A dictionary of canned regular expressions to search within if `pattern` begins with `"@rm_"`.
`...`	Other arguments passed to `gsub`.
`merge`	logical. If `TRUE` the results of each bracket type will be merged by string. `FALSE` returns a named list of lists of vectors of bracketed text per bracket type.

Value

rm_bracket - returns a character string with multiple brackets removed. If extract = TRUE the results are optionally merged and named by bracket type. This is more flexible than rm_bracket but slower.

rm_round - returns a character string with round brackets removed.

rm_square - returns a character string with square brackets removed.

rm_curly - returns a character string with curly brackets removed.

rm_angle - returns a character string with angle brackets removed.

rm_bracket_multiple - returns a character string with multiple brackets removed. If extract = TRUE the results are optionally merged and named by bracket type. This is more flexible than rm_bracket but slower.

Author(s)

Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>.

References

https://stackoverflow.com/q/8621066/1000343

Examples

examp <- structure(list(person = structure(c(1L, 2L, 1L, 3L),
    .Label = c("bob", "greg", "sue"), class = "factor"), text =
    c("I love chicken [unintelligible]!",
    "Me too! (laughter) It's so good.[interrupting]",
    "Yep it's awesome {reading}.", "Agreed. {is so much fun}")), .Names =
    c("person", "text"), row.names = c(NA, -4L), class = "data.frame")

examp
rm_bracket(examp$text, pattern = "square")
rm_bracket(examp$text, pattern = "curly")
rm_bracket(examp$text, pattern = c("square", "round"))
rm_bracket(examp$text)

ex_bracket(examp$text, pattern = "square")
ex_bracket(examp$text, pattern = "curly")
ex_bracket(examp$text, pattern = c("square", "round"))
ex_bracket(examp$text, pattern = c("square", "round"), merge = FALSE)
ex_bracket(examp$text)
ex_bracket(examp$tex, include.markers=TRUE)

## Not run: 
library(qdap)
ex_bracket(examp$tex, pattern="curly") %>% 
  unlist() %>% 
  na.omit() %>% 
  paste2()

## End(Not run)

x <- "I like [bots] (not). And <likely> many do not {he he}"

rm_round(x)
ex_round(x)
ex_round(x, include.marker = TRUE)

rm_square(x)
ex_square(x)

rm_curly(x)
ex_curly(x)

rm_angle(x)
ex_angle(x)

lapply(ex_between('She said, "I am!" and he responded..."Am what?".', 
    left='"', right='"'), "[", c(TRUE, FALSE))

qdapRegex documentation built on April 4, 2025, 4:35 a.m.