Remove/Replace/Extract Brackets

Description

Remove/replace/extract bracketed strings.

Usage

rm_bracket(text.var, pattern = "all", trim = TRUE, clean = TRUE,
  replacement = "", extract = FALSE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

rm_round(text.var, pattern = "(", trim = TRUE, clean = TRUE,
  replacement = "", extract = FALSE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

rm_square(text.var, pattern = "[", trim = TRUE, clean = TRUE,
  replacement = "", extract = FALSE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

rm_curly(text.var, pattern = "{", trim = TRUE, clean = TRUE,
  replacement = "", extract = FALSE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

rm_angle(text.var, pattern = "<", trim = TRUE, clean = TRUE,
  replacement = "", extract = FALSE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

rm_bracket_multiple(text.var, trim = TRUE, clean = TRUE, pattern = "all",
  replacement = "", extract = FALSE, include.markers = FALSE,
  merge = TRUE)

ex_bracket(text.var, pattern = "all", trim = TRUE, clean = TRUE,
  replacement = "", extract = TRUE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

ex_bracket_multiple(text.var, trim = TRUE, clean = TRUE, pattern = "all",
  replacement = "", extract = TRUE, include.markers = FALSE,
  merge = TRUE)

ex_angle(text.var, pattern = "<", trim = TRUE, clean = TRUE,
  replacement = "", extract = TRUE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

ex_round(text.var, pattern = "(", trim = TRUE, clean = TRUE,
  replacement = "", extract = TRUE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

ex_square(text.var, pattern = "[", trim = TRUE, clean = TRUE,
  replacement = "", extract = TRUE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

ex_curly(text.var, pattern = "{", trim = TRUE, clean = TRUE,
  replacement = "", extract = TRUE, include.markers = ifelse(extract,
  FALSE, TRUE), dictionary = getOption("regex.library"), ...)

Arguments

text.var

The text variable.

pattern

The type of bracket (and encased text) to remove. This is one or more of the strings "curly"/"{", "square"/"[", "round"/"(", "angle"/"<" and "all". These strings correspond to: {, [, (, < or all four types.

trim

logical. If TRUE removes leading and trailing white spaces.

clean

logical. If TRUE extra white spaces and escaped characters will be removed.

replacement

Replacement for matched pattern.

extract

logical. If TRUE the bracketed text is extracted into a list of vectors.

include.markers

logical. If TRUE and extract = TRUE returns the markers (left/right) and the text between.

dictionary

A dictionary of canned regular expressions to search within if pattern begins with "@rm_".

merge

logical. If TRUE the results of each bracket type will be merged by string. FALSE returns a named list of lists of vectors of bracketed text per bracket type.

...

Other arguments passed to gsub.
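
As a rough illustration of how the pattern, trim, clean, extract, and include.markers arguments relate to each other, the sketch below approximates removal and extraction of square-bracketed text using base R only. It is not the package's implementation: the regular expression and the variable names are assumptions made for this example, and the package's own cleaning may treat spacing around punctuation differently.

```r
# Base R approximation; NOT the qdapRegex implementation.
x <- c("I love chicken [unintelligible]!", "Yep it's awesome {reading}.")

# Assumed pattern: an opening "[", any run of non-"]" characters, a closing "]".
square <- "\\[[^]]*\\]"

# Removal: replace each match with "" (cf. replacement = ""), then collapse
# runs of white space and trim both ends (roughly clean = TRUE, trim = TRUE).
removed <- gsub(square, "", x)
removed <- trimws(gsub("\\s+", " ", removed))

# Extraction with markers (cf. extract = TRUE, include.markers = TRUE):
# a list holding one character vector of matches per input string.
with_markers <- regmatches(x, gregexpr(square, x))

# Extraction without markers (cf. include.markers = FALSE):
# strip the leading "[" and trailing "]" from every match.
without_markers <- lapply(with_markers, function(m) gsub("^\\[|\\]$", "", m))

removed
with_markers
without_markers
```

Strings with no match yield an empty character vector in this sketch; the package functions may represent a missing match differently (for example as NA).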

Value

rm_bracket - returns a character string with the specified brackets (and encased text) removed. If extract = TRUE the bracketed text is returned as a list of vectors.

rm_round - returns a character string with round brackets removed.

rm_square - returns a character string with square brackets removed.

rm_curly - returns a character string with curly brackets removed.

rm_angle - returns a character string with angle brackets removed.

rm_bracket_multiple - returns a character string with multiple brackets removed. If extract = TRUE the results are optionally merged and named by bracket type. This is more flexible than rm_bracket but slower.
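
The difference between merged and unmerged extraction results can be pictured with a small base R sketch. This is only an assumption about the shape of the two results, based on the description of the merge argument; the patterns and object names are hypothetical, and the ordering of merged matches in the package may differ.

```r
# Base R illustration of "merged by string" versus "per bracket type".
# NOT the qdapRegex implementation; patterns and names are assumptions.
x <- c("I like [bots] (not).", "Many do not {he he}")

# One assumed pattern per bracket type of interest.
patterns <- c(square = "\\[[^]]*\\]", round = "\\([^)]*\\)")

# Unmerged shape: a named list (one element per bracket type), each element
# a list with one character vector of matches per input string.
by_type <- lapply(patterns, function(p) regmatches(x, gregexpr(p, x)))

# Merged shape: for each input string, combine the matches of all types.
# Here matches are ordered by bracket type, not by position in the string.
merged <- Map(c, by_type$square, by_type$round)

names(by_type)
merged
```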

Author(s)

Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>.

References

http://stackoverflow.com/q/8621066/1000343

See Also

gsub, rm_between, stri_extract_all_regex

Other rm_.functions: as_numeric, as_numeric2, ex_number, rm_number; as_time, as_time2, ex_time, ex_transcript_time, rm_time, rm_transcript_time; ex_abbreviation, rm_abbreviation; ex_between, ex_between_multiple, rm_between, rm_between_multiple; ex_caps_phrase, rm_caps_phrase; ex_caps, rm_caps; ex_citation_tex, rm_citation_tex; ex_citation, rm_citation; ex_city_state_zip, rm_city_state_zip; ex_city_state, rm_city_state; ex_date, rm_date; ex_default, rm_default; ex_dollar, rm_dollar; ex_email, rm_email; ex_emoticon, rm_emoticon; ex_endmark, rm_endmark; ex_hash, rm_hash; ex_nchar_words, rm_nchar_words; ex_non_ascii, rm_non_ascii; ex_non_words, rm_non_words; ex_percent, rm_percent; ex_phone, rm_phone; ex_postal_code, rm_postal_code; ex_repeated_characters, rm_repeated_characters; ex_repeated_phrases, rm_repeated_phrases; ex_repeated_words, rm_repeated_words; ex_tag, rm_tag; ex_title_name, rm_title_name; ex_twitter_url, ex_url, rm_twitter_url, rm_url; ex_white, ex_white_bracket, ex_white_colon, ex_white_comma, ex_white_endmark, ex_white_lead, ex_white_lead_trail, ex_white_multiple, ex_white_punctuation, ex_white_trail, rm_white, rm_white_bracket, rm_white_colon, rm_white_comma, rm_white_endmark, rm_white_lead, rm_white_lead_trail, rm_white_multiple, rm_white_punctuation, rm_white_trail; ex_zip, rm_zip

Examples

examp <- structure(list(person = structure(c(1L, 2L, 1L, 3L),
    .Label = c("bob", "greg", "sue"), class = "factor"), text =
    c("I love chicken [unintelligible]!",
    "Me too! (laughter) It's so good.[interrupting]",
    "Yep it's awesome {reading}.", "Agreed. {is so much fun}")), .Names =
    c("person", "text"), row.names = c(NA, -4L), class = "data.frame")

examp
rm_bracket(examp$text, pattern = "square")
rm_bracket(examp$text, pattern = "curly")
rm_bracket(examp$text, pattern = c("square", "round"))
rm_bracket(examp$text)

ex_bracket(examp$text, pattern = "square")
ex_bracket(examp$text, pattern = "curly")
ex_bracket(examp$text, pattern = c("square", "round"))
ex_bracket_multiple(examp$text, pattern = c("square", "round"), merge = FALSE)
ex_bracket(examp$text)
ex_bracket(examp$text, include.markers = TRUE)

## Not run: 
library(qdap)
ex_bracket(examp$text, pattern = "curly") %>%
  unlist() %>%
  na.omit() %>%
  paste2()

## End(Not run)

x <- "I like [bots] (not). And <likely> many do not {he he}"

rm_round(x)
ex_round(x)
ex_round(x, include.markers = TRUE)

rm_square(x)
ex_square(x)

rm_curly(x)
ex_curly(x)

rm_angle(x)
ex_angle(x)

lapply(ex_between('She said, "I am!" and he responded..."Am what?".',
    left='"', right='"'), "[", c(TRUE, FALSE))
