split_by_parens: Split columns by parentheses, brackets, braces, or similar
In scrutiny: Error Detection in Science

Description

Summary statistics are often presented like "2.65 (0.27)". When working with tables copied into R, it can be tedious to separate values before and inside parentheses. split_by_parens() does this automatically.

By default, it operates on all columns. Output can optionally be pivoted into a longer format by setting transform to TRUE.

Choose separators other than parentheses with the sep argument.
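
For a sense of the manual work this replaces, the following is a minimal base-R sketch of splitting such strings by hand. It is an illustration only, not how scrutiny implements split_by_parens(); the object names (values, before, inside, manual_split) are made up for this example, and the column names x and sd simply mirror the default end1 and end2 endings.

```r
# Illustration only: a hand-rolled split, NOT scrutiny's implementation.
# All object names here are hypothetical.
values <- c("2.65 (0.27)", "0.09 (0.21)")

# Part before the opening parenthesis, with surrounding whitespace removed:
before <- trimws(sub("\\(.*$", "", values))

# Part between the opening and closing parentheses:
inside <- sub("^.*\\((.*)\\).*$", "\\1", values)

# Values are kept as strings in this sketch so that
# trailing zeros (as in "0.10") would not be dropped:
manual_split <- data.frame(
  x = before,
  sd = inside,
  stringsAsFactors = FALSE
)

manual_split
```

Repeating these steps for every column, and adapting the patterns for brackets or braces, is the tedium the function is meant to remove.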

Usage

split_by_parens(
  data,
  cols = everything(),
  check_sep = TRUE,
  keep = FALSE,
  transform = FALSE,
  sep = "parens",
  end1 = "x",
  end2 = "sd",
  ...
)

Arguments

`data`	Data frame.
`cols`	Select columns from `data` using tidyselect. Default is `everything()`, which selects all columns that pass `check_sep`.
`check_sep`	Logical. If `TRUE` (the default), columns are excluded if they don't contain the `sep` elements.
`keep`	Logical. If set to `TRUE`, the originally selected columns that were split by the function also appear in the output. Default is `FALSE`.
`transform`	Logical. If set to `TRUE`, the output is pivoted into a longer format that is better suited to typical follow-up tasks. Default is `FALSE`.
`sep`	String. What to split by. Either `"parens"`, `"brackets"`, or `"braces"`; or a length-2 vector of custom separators (see Examples). Default is `"parens"`.
`end1`, `end2`	Strings. Endings of the two column names that result from splitting a column. Default is `"x"` for `end1` and `"sd"` for `end2`.
`...`	These dots must be empty.

Value

Data frame.

Examples

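# Attach scrutiny, which provides `split_by_parens()`.
# The examples assume that the `%>%` pipe is available
# once the package is attached:
library(scrutiny)

# Call `split_by_parens()` on data like these: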
df1 <- tibble::tribble(
  ~drone,           ~selfpilot,
  "0.09 (0.21)",    "0.19 (0.13)",
  "0.19 (0.28)",    "0.53 (0.10)",
  "0.62 (0.16)",    "0.50 (0.11)",
  "0.15 (0.35)",    "0.57 (0.16)",
)

# Basic usage:
df1 %>%
  split_by_parens()

# Name specific columns with `cols` to only split those:
df1 %>%
  split_by_parens(cols = drone)

# Pivot the data into a longer format
# by setting `transform` to `TRUE`:
df1 %>%
  split_by_parens(transform = TRUE)

# Choose different column names or
# name suffixes with `end1` and `end2`:
df1 %>%
  split_by_parens(end1 = "beta", end2 = "se")

df1 %>%
  split_by_parens(
    transform = TRUE,
    end1 = "beta", end2 = "se"
  )

# With a different separator...
df2 <- tibble::tribble(
  ~drone,           ~selfpilot,
  "0.09 [0.21]",    "0.19 [0.13]",
  "0.19 [0.28]",    "0.53 [0.10]",
  "0.62 [0.16]",    "0.50 [0.11]",
  "0.15 [0.35]",    "0.57 [0.16]",
)

# ... specify `sep`:
df2 %>%
  split_by_parens(sep = "brackets")

# (The same works for `{}` with `sep = "braces"`.)

# If the separator is something else entirely...
df3 <- tibble::tribble(
  ~drone,           ~selfpilot,
  "0.09 <0.21>",    "0.19 <0.13>",
  "0.19 <0.28>",    "0.53 <0.10>",
  "0.62 <0.16>",    "0.50 <0.11>",
  "0.15 <0.35>",    "0.57 <0.16>",
)

# ... `sep` should be a length-2 vector
# that contains the separating elements:
df3 %>%
  split_by_parens(sep = c("<", ">"))

scrutiny documentation built on Sept. 22, 2024, 9:06 a.m.