uj: Utilities by JAM

ss_help

R Documentation

Split strings and select/check for elements

Description

Order of operations

All functions in this family follow the same order of operations based on processing specs in d, trm, sqz, n, and u in performing string splitting:

Atomize ..., collapsing it to a simple atomic vector.
Coerce the result to a character vector.
Split each element of the vector along the delimiter(s) in d, producing a potentially longer character vector.
If n is not NULL, extract the n-th element(s) from the result.
If trm = TRUE, trim white space (i.e., spaces, tabs, newlines) from both ends of each element of the result.
If sqz = TRUE, remove leading and trailing white space and replace each multi-character interior white-space sequence with a single space.
If u = TRUE, reduce the result to unique values.
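
The sequence above can be approximated in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: the helper name `split_sketch` is hypothetical, delimiters are treated as fixed strings rather than regular expressions, and `drop` is left out because its position in the sequence is not stated above.

```r
# Hypothetical helper mirroring the documented order of operations.
# Base R only; this is NOT the uj implementation.
split_sketch <- function(d, ..., trm = TRUE, sqz = TRUE, u = FALSE, n = NULL) {
  x <- unlist(list(...), use.names = FALSE)      # 1. atomize ...
  x <- as.character(x)                           # 2. coerce to character
  for (delim in d) {                             # 3. split along each delimiter
    x <- unlist(strsplit(x, delim, fixed = TRUE), use.names = FALSE)
  }
  if (!is.null(n)) x <- x[n]                     # 4. extract the n-th element(s)
  if (trm) x <- trimws(x)                        # 5. trim both ends
  if (sqz) {                                     # 6. squeeze interior white space
    x <- gsub("[[:space:]]{2,}", " ", trimws(x))
  }
  if (u) x <- unique(x)                          # 7. reduce to unique values
  x
}

split_sketch("|", " super | cooled ", "super|heated", u = TRUE)
# "super" "cooled" "heated"
```

Because extraction with `n` happens before trimming in this sequence, `n` indexes the raw split pieces rather than the cleaned ones.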

Usage

ss_help()

ss(d, ..., trm = TRUE, sqz = TRUE, drop = TRUE, u = FALSE, n = NULL)

sstb(
  d,
  ...,
  name = "string",
  part = "part",
  trm = TRUE,
  sqz = TRUE,
  drop = TRUE,
  u = FALSE,
  n = NULL
)

ss0(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

chars(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ch(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ss1(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssp(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

sspd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

sspb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

sspdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

uss(d, ..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

uss0(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

uchars(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

uch(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usstb(
  d,
  ...,
  name = "string",
  part = "part",
  trm = TRUE,
  sqz = TRUE,
  drop = TRUE,
  n = NULL
)

uss1(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussp(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usspd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usspb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usspdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

Arguments

`d`	A complete character vec of one or more delimiters to use in splitting strings.
`...`	An arbitrary number of objects to be atomized before splitting.
`trm`	`TRUE` or `FALSE` indicating whether to trim white space from each side of each element of the result.
`sqz`	`TRUE` or `FALSE` indicating whether to squeeze the result by removing extra internal whitespace.
`drop`	`TRUE` or `FALSE`. For functions associated with character-wise splitting (i.e., `ss0`, `chars`, `ch`, and their `u`-prefixed counterparts), indicates whether to drop resulting values that are not letters, digits, or spaces. For all others, indicates whether to drop resulting blank string values.
`u`	A complete non-`NA` logical scalar indicating whether to reduce the result to unique values.
`n`	An optional complete positive whole-number vec specifying one or more elements to be extracted from the result.
`name`	A complete character scalar name of the variable to hold the original strings.
`part`	A complete character scalar prefix for labeling components of vectors resulting from split strings.
`x`	A character vec of string(s) to be split.
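
The two meanings of `drop` described above can be illustrated in base R. This is a sketch only, not the package's code: the regular expression used to recognize letters, digits, and spaces is an assumption, as is treating "blank" as the empty string.

```r
# Base R illustration of the two documented meanings of `drop`.
# NOT the uj implementation; the pattern below is an assumption.

# Character-wise splitting: keep only letters, digits, and spaces.
chars <- strsplit("super-cooled", "", fixed = TRUE)[[1]]
kept_chars <- chars[grepl("^[[:alnum:] ]$", chars)]
kept_chars   # the "-" is no longer present

# Delimiter-based splitting: drop blank (here: empty) strings.
parts <- strsplit("super||cooled", "|", fixed = TRUE)[[1]]
kept_parts <- parts[parts != ""]
kept_parts   # "super" "cooled"
```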

Functions

ss(): Splits strings in ... along the delimiter d subject to processing specs in trm, sqz, drop, u, and n. Returns a character vector.
sstb(): Requires that splitting each element of x along the delimiter d and post-processing the results based on optional args trm, sqz, drop, n, and u will result in vectors of the same length. Those resulting same-length vectors are then placed into a data.frame. For example, the following console excerpt demonstrates a call to sstb() and its result.
```
> sstb('|', 'a|b|c|d', 'e|f|g|h', 'i|j|k|l', 'm|n|o|p', name = 'original', part = 'letter')

  original letter.1 letter.2 letter.3 letter.4
1  a|b|c|d        a        b        c        d
2  e|f|g|h        e        f        g        h
3  i|j|k|l        i        j        k        l
4  m|n|o|p        m        n        o        p
```
ss0(): Splits strings in ... into constituent characters subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector composed of single characters.
chars(): An alias for ss0().
ch(): An alias for ss0().
ss1(): Splits strings in ... using a space (' ') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
ssp(): Splits strings in ... using a pipe ('|') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
ssd(): Splits strings in ... using a dot/period ('.') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
ssb(): Splits strings in ... using a broken pipe ('¦') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
sspd(): Splits strings in ... using both pipes ('|') and dots/periods ('.') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
sspb(): Splits strings in ... using both pipes ('|') and broken pipes ('¦') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
ssdb(): Splits strings in ... using both dots/periods ('.') and broken pipes ('¦') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
sspdb(): Splits strings in ... using pipes ('|'), dots/periods ('.'), and broken pipes ('¦') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.
uss(): Splits strings in ... using the delimiter d and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
uss0(): Splits strings in ... into constituent characters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
uchars(): An alias for uss0().
uch(): An alias for uss0().
usstb(): Calls sstb and returns only unique rows of the result, with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a data.frame.
uss1(): Splits strings in ... using a space (' ') as delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
ussp(): Splits strings in ... using a pipe ('|') as delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
ussd(): Splits strings in ... using a dot/period ('.') as delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
ussb(): Splits strings in ... using a broken pipe ('¦') as delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
usspd(): Splits strings in ... using pipes ('|') and dots/periods ('.') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
usspb(): Splits strings in ... using pipes ('|') and broken pipes ('¦') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
ussdb(): Splits strings in ... using dots/periods ('.') and broken pipes ('¦') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
usspdb(): Splits strings in ... using pipes ('|'), dots/periods ('.'), and broken pipes ('¦') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
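
The u-prefixed descriptions place `n` after the reduction to unique values, so it is worth seeing why the position of the extraction step matters. The base R sketch below does not call the package and makes no claim about which order it actually uses; it only contrasts the two possible readings on the same split result.

```r
# Base R only; NOT the uj implementation.
parts <- unlist(strsplit(c("super|cooled", "super|heated"), "|", fixed = TRUE))
parts   # "super" "cooled" "super" "heated"

# Reading 1: reduce to unique values first, then take element 3.
unique_then_n <- unique(parts)[3]
unique_then_n   # "heated"

# Reading 2: take element 3 first, then reduce to unique values.
n_then_unique <- unique(parts[3])
n_then_unique   # "super"
```

When the exact element matters, checking the result on a small input such as this one is a cheap safeguard.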

Examples

ss("", "super-cooled")
ss("|", "super||cooled", "super|heated")
ss0("super-cooled", "super-heated")
ss1("super cooled", "super heated")
ssp("super|cooled", "super|heated", u = TRUE)
ssd("super.cooled", "super.heated")
ssb("super¦cooled", "super¦heated")
sspd("super|cooled", "super.heated")
sspb("super|cooled", "super¦heated")
ssdb("super.cooled", "super¦heated")
sspdb("super|cooled¦|super|heated", u = TRUE)
sspdb(" super|cooled  ¦super..  heated", n = 3)
sspdb(" super|cooled  ¦super..  heated", trm = FALSE, sqz = FALSE, drop = FALSE, n = 3)
uss("", "super-cooled")
uss("|", "super|cooled", "super|heated")
uss0("super-cooled", "super-heated")
uss1("super cooled", "super heated")
ussp("super|cooled", "super|heated")
ussd("super.cooled", "super.heated")
ussb("super¦cooled", "super¦heated")
usspd("super|cooled", "super.heated")
usspb("super|cooled", "super¦heated")
ussdb("super.cooled", "super¦heated")
usspdb("super|cooled¦super|heated")
sstb("|", 'a|b|c|d', 'e|f|g|h', 'i|j|k|l', 'm|n|o|p')
sstb("|", 'a|b|c|d', 'e|f|g|h', 'i|j|k|l', 'm|n|o|p', name = 'original', part = 'letter')
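
The table that the sstb() calls above are documented to return can be reconstructed by hand in base R. This is a hypothetical sketch, not the package's implementation; the variable names `strings`, `pieces`, `parts`, and `tb` are illustrative, and the column labels follow the `name = 'original'`, `part = 'letter'` pattern shown in the console excerpt.

```r
# Hypothetical base R reconstruction of the documented sstb() table.
# NOT the uj implementation.
strings <- c("a|b|c|d", "e|f|g|h", "i|j|k|l", "m|n|o|p")
pieces <- strsplit(strings, "|", fixed = TRUE)

# A rectangular result needs every string to yield the same number of parts.
stopifnot(length(unique(lengths(pieces))) == 1)

parts <- do.call(rbind, pieces)                          # one row per string
colnames(parts) <- paste0("letter.", seq_len(ncol(parts)))
tb <- data.frame(original = strings, parts, stringsAsFactors = FALSE)
tb
```

The same-length check is the key design point: if one string splits into more parts than another, the pieces no longer fit a rectangular data.frame.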

j-martineau/uj documentation built on Sept. 14, 2024, 4:40 a.m.