split_by_parens: Split columns by parentheses, brackets, braces, or similar

View source: R/split-by-parens.R

split_by_parens    R Documentation

Description

Summary statistics are often presented like "2.65 (0.27)". When working with tables copied into R, it can be tedious to separate values before and inside parentheses. split_by_parens() does this automatically.

By default, it operates on all columns. Output can optionally be pivoted into a longer format by setting transform to TRUE.

Choose separators other than parentheses with the sep argument.
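
The idea behind the splitting can be sketched in base R. The helper below, `split_pair()`, is hypothetical and not part of scrutiny; it only illustrates separating the value before an opening separator from the value inside the pair. It assumes that every string contains both separators. Values are kept as strings, so trailing zeros such as in "0.10" are preserved.

```r
# Hypothetical helper for illustration only (not scrutiny's implementation).
# Assumes each string contains `open` followed later by `close`.
split_pair <- function(x, open = "(", close = ")") {
  # Locate the opening separator literally, without regex interpretation
  open_pos <- as.integer(regexpr(open, x, fixed = TRUE))
  # Everything before the opening separator, whitespace trimmed
  before <- trimws(substr(x, 1, open_pos - 1))
  # Everything after the opening separator
  rest <- substr(x, open_pos + nchar(open), nchar(x))
  # Cut off at the closing separator to get the inner value
  close_pos <- as.integer(regexpr(close, rest, fixed = TRUE))
  inside <- trimws(substr(rest, 1, close_pos - 1))
  data.frame(before = before, inside = inside, stringsAsFactors = FALSE)
}

split_pair(c("2.65 (0.27)", "0.53 (0.10)"))
split_pair("0.62 <0.16>", open = "<", close = ">")
```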

Usage

split_by_parens(
  data,
  cols = everything(),
  check_sep = TRUE,
  keep = FALSE,
  transform = FALSE,
  sep = "parens",
  end1 = "x",
  end2 = "sd",
  ...
)

Arguments

data

Data frame.

cols

Select columns from data using tidyselect. Default is everything(), which selects all columns that pass check_sep.

check_sep

Logical. If TRUE (the default), columns are excluded if they don't contain the sep elements.

keep

Logical. If set to TRUE, the originally selected columns that were split by the function also appear in the output. Default is FALSE.

transform

Logical. If set to TRUE, the output will be pivoted into a longer format that is better suited to typical follow-up tasks. Default is FALSE.

sep

String. What to split by. Either "parens", "brackets", or "braces"; or a length-2 vector of custom separators (see Examples). Default is "parens".

end1, end2

Strings. Endings of the two column names that result from splitting a column. Default is "x" for end1 and "sd" for end2.

...

These dots must be empty.
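
The column filter described under check_sep can be pictured with a small base R sketch. `has_sep()` is a hypothetical name, not a scrutiny function. Whether a column must contain the separators in every value or only in some is not stated here, so the sketch assumes the stricter rule: all values must contain both elements.

```r
# Hypothetical check for illustration only (not scrutiny's implementation).
# Assumption: a column qualifies only if ALL of its values contain both
# the opening and the closing separator.
has_sep <- function(data, open = "(", close = ")") {
  vapply(data, function(col) {
    col <- as.character(col)
    all(grepl(open, col, fixed = TRUE) & grepl(close, col, fixed = TRUE))
  }, logical(1))
}

df <- data.frame(
  id    = c("a", "b"),
  drone = c("0.09 (0.21)", "0.19 (0.28)"),
  stringsAsFactors = FALSE
)

# Named logical vector: FALSE for `id`, TRUE for `drone`
has_sep(df)
```

Under this assumption, a column like `id` would be left alone even if `cols` selects everything.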

Value

Data frame.
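
To give a rough picture of the difference between the default wide shape and the longer shape requested with transform, here is a base R sketch. The wide column names (original name plus the `end1` / `end2` suffix, joined by an underscore) and the `origin` column are assumptions made for illustration; the actual names in the function's output may differ.

```r
# Assumed wide layout after splitting: one column pair per original column.
wide <- data.frame(
  drone_x      = c("0.09", "0.19"),
  drone_sd     = c("0.21", "0.28"),
  selfpilot_x  = c("0.19", "0.53"),
  selfpilot_sd = c("0.13", "0.10"),
  stringsAsFactors = FALSE
)

# Stack the column pairs on top of each other, recording where each
# row came from in a (hypothetically named) `origin` column.
origins <- c("drone", "selfpilot")
long <- do.call(rbind, lapply(origins, function(o) {
  data.frame(
    origin = o,
    x      = wide[[paste0(o, "_x")]],
    sd     = wide[[paste0(o, "_sd")]],
    stringsAsFactors = FALSE
  )
}))

long
```

The longer shape has one row per original cell, which tends to be easier to pass on to functions that expect one `x` and one `sd` column.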

See Also

  • before_parens() and inside_parens() take a string vector and extract values from the respective position.

  • dplyr::across() powers the application of the two above functions within split_by_parens(), including the creation of new columns.

  • tidyr::separate_wider_delim() is a more general function, but it does not recognize closing elements such as closing parentheses.
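
The limitation mentioned in the last point applies to any split on a single delimiter. The sketch below uses base R's strsplit() as a stand-in for delimiter-based splitting in general, not tidyr itself: the closing parenthesis stays attached to the second value and needs a separate cleanup step.

```r
# Split on the opening element only, treated as a literal string
parts <- strsplit("0.09 (0.21)", " (", fixed = TRUE)[[1]]
parts  # the second element still ends in ")"

# A second step is needed to drop the closing element
cleaned <- sub(")", "", parts[2], fixed = TRUE)
cleaned
```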

Examples

# Call `split_by_parens()` on data like these:
df1 <- tibble::tribble(
  ~drone,           ~selfpilot,
  "0.09 (0.21)",    "0.19 (0.13)",
  "0.19 (0.28)",    "0.53 (0.10)",
  "0.62 (0.16)",    "0.50 (0.11)",
  "0.15 (0.35)",    "0.57 (0.16)",
)

# Basic usage:
df1 %>%
  split_by_parens()

# Name specific columns with `cols` to only split those:
df1 %>%
  split_by_parens(cols = drone)

# Pivot the data into a longer format
# by setting `transform` to `TRUE`:
df1 %>%
  split_by_parens(transform = TRUE)

# Choose different column names or
# name suffixes with `end1` and `end2`:
df1 %>%
  split_by_parens(end1 = "beta", end2 = "se")

df1 %>%
  split_by_parens(
    transform = TRUE,
    end1 = "beta", end2 = "se"
  )

# With a different separator...
df2 <- tibble::tribble(
  ~drone,           ~selfpilot,
  "0.09 [0.21]",    "0.19 [0.13]",
  "0.19 [0.28]",    "0.53 [0.10]",
  "0.62 [0.16]",    "0.50 [0.11]",
  "0.15 [0.35]",    "0.57 [0.16]",
)

# ... specify `sep`:
df2 %>%
  split_by_parens(sep = "brackets")

# (The same works with `{}` and `"braces"`.)

# If the separators are something else entirely...
df3 <- tibble::tribble(
  ~drone,           ~selfpilot,
  "0.09 <0.21>",    "0.19 <0.13>",
  "0.19 <0.28>",    "0.53 <0.10>",
  "0.62 <0.16>",    "0.50 <0.11>",
  "0.15 <0.35>",    "0.57 <0.16>",
)

# ... `sep` should be a length-2 vector
# that contains the separating elements:
df3 %>%
  split_by_parens(sep = c("<", ">"))

scrutiny documentation built on Sept. 22, 2024, 9:06 a.m.