ss_help: Split strings and select/check for elements

View source: R/ss_funs.R

ss_help    R Documentation

Description

Order of operations

All functions in this family split strings using the same order of operations, governed by the processing specs in d, trm, sqz, n, and u:

  • Atomize ..., collapsing it to a simple atomic vector.

  • Coerce the result to a character vector.

  • Split each element of the vector along the delimiter(s) in d, producing a potentially longer character vector.

  • If n is not NULL, extract the n-th element(s) from the result.

  • If trm = TRUE, trim white space (i.e., spaces, tabs, newlines) from both ends of each element of the result.

  • If sqz = TRUE, remove leading and trailing white space and replace any multi-character interior white-space sequence in each element of the result with a single space.

  • If u = TRUE, reduce the result to unique values.
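
The steps above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: split_sketch() is a hypothetical name, delimiters are treated as fixed strings and applied one after another, and the drop argument is left out because it does not appear in the list above.

```r
# Hypothetical base-R sketch of the documented order of operations.
# split_sketch() is an illustrative name and is not part of the package.
split_sketch <- function(d, ..., trm = TRUE, sqz = TRUE, u = FALSE, n = NULL) {
  x <- unlist(list(...), use.names = FALSE)      # atomize ...
  x <- as.character(x)                           # coerce to character
  for (delim in d) {                             # split along each delimiter in d
    x <- unlist(strsplit(x, delim, fixed = TRUE), use.names = FALSE)
  }
  if (!is.null(n)) x <- x[n]                     # extract the n-th element(s)
  if (trm) x <- trimws(x)                        # trim both ends
  if (sqz) x <- gsub("\\s{2,}", " ", trimws(x))  # squeeze interior white-space runs
  if (u) x <- unique(x)                          # reduce to unique values
  x
}

split_sketch("|", " super | cooled  air ", "super|heated", u = TRUE)
# "super" "cooled air" "heated"
```

Because n is applied before trimming and squeezing in this sketch, it indexes the raw split pieces, which matches the order listed above.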

Usage

ss_help()

ss(d, ..., trm = TRUE, sqz = TRUE, drop = TRUE, u = FALSE, n = NULL)

sstb(
  d,
  ...,
  name = "string",
  part = "part",
  trm = TRUE,
  sqz = TRUE,
  drop = TRUE,
  u = FALSE,
  n = NULL
)

ss0(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

chars(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ch(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ss1(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssp(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

sspd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

sspb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

ssdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

sspdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL, u = FALSE)

uss(d, ..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

uss0(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

uchars(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

uch(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usstb(
  d,
  ...,
  name = "string",
  part = "part",
  trm = TRUE,
  sqz = TRUE,
  drop = TRUE,
  n = NULL
)

uss1(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussp(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usspd(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usspb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

ussdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

usspdb(..., trm = TRUE, sqz = TRUE, drop = TRUE, n = NULL)

Arguments

d

A complete character vec of one or more delimiters to use in splitting strings.

...

An arbitrary number of objects to be atomized before splitting.

trm

TRUE or FALSE indicating whether to trim white space from each side of each element of the result.

sqz

TRUE or FALSE indicating whether to squeeze the result by removing extra internal whitespace.

drop

TRUE or FALSE. For functions associated with character-wise splitting (i.e., ending in 0 or ch), indicates whether to drop resulting values that are not letters, digits, or spaces. For all others, indicates whether to drop resulting blank string values.

u

A complete non-NA logical scalar (TRUE or FALSE) indicating whether to reduce the result to unique values.

n

An optional complete positive whole-number vec specifying one or more elements to be extracted from the result.

name

A complete character scalar name of the variable to hold the original strings.

part

A complete character scalar prefix for labeling components of vectors resulting from split strings.

x

A character vec of string(s) to be split.
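
As a rough illustration of how name and part could label the columns of a table of split strings, here is a minimal base-R sketch. The function split_table_sketch() is hypothetical and is not the package's sstb(); it assumes a single fixed delimiter, does no trimming or squeezing, and numbers the part columns with a dot followed by the position.

```r
# Hypothetical base-R sketch; split_table_sketch() is an illustrative name.
split_table_sketch <- function(d, x, name = "string", part = "part") {
  pieces <- strsplit(x, d, fixed = TRUE)          # one vector of parts per string
  if (length(unique(lengths(pieces))) != 1L) {
    stop("every string must split into the same number of parts")
  }
  parts <- do.call(rbind, pieces)                 # one row per original string
  colnames(parts) <- paste0(part, ".", seq_len(ncol(parts)))
  out <- data.frame(x, parts, stringsAsFactors = FALSE)
  names(out)[1] <- name                           # column holding the original strings
  out
}

tb <- split_table_sketch("|", c("a|b|c|d", "e|f|g|h"),
                         name = "original", part = "letter")
names(tb)
# "original" "letter.1" "letter.2" "letter.3" "letter.4"
```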

Functions

  • ss(): Splits strings in ... along the delimiter d subject to processing specs in trm, sqz, drop, u, and n. Returns a character vector.

  • sstb(): Requires that splitting each element of x along the delimiter d and post-processing the results based on optional args trm, sqz, drop, n, and u will result in vectors of the same length. Those resulting same-length vectors are then placed into a data.frame. For example, the following console excerpt demonstrates a call to sstb and its result.

    > sstb('|', 'a|b|c|d', 'e|f|g|h', 'i|j|k|l', 'm|n|o|p', name = 'original', part = 'letter')
    
      original letter.1 letter.2 letter.3 letter.4
    1  a|b|c|d        a        b        c        d
    2  e|f|g|h        e        f        g        h
    3  i|j|k|l        i        j        k        l
    4  m|n|o|p        m        n        o        p
    
  • ss0(): Splits strings in ... into constituent characters subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector composed of single characters.

  • chars(): An alias for ss0().

  • ch(): An alias for ss0().

  • ss1(): Splits strings in ... using a space (' ') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • ssp(): Splits strings in ... using a pipe ('|') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • ssd(): Splits strings in ... using a dot/period ('.') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • ssb(): Splits strings in ... using a broken pipe ('¦') as a delimiter, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • sspd(): Splits strings in ... using both pipes ('|') and dots/periods ('.') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • sspb(): Splits strings in ... using both pipes ('|') and broken pipes ('¦') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • ssdb(): Splits strings in ... using both dots/periods ('.') and broken pipes ('¦') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • sspdb(): Splits strings in ... using pipes ('|'), dots/periods ('.'), and broken pipes ('¦') as delimiters, subject to processing specs in trm, sqz, drop, n, and u. Returns a character vector.

  • uss(): Splits strings in ... using the delimiter d and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • uss0(): Splits strings in ... into constituent characters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • uchars(): An alias for uss0.

  • uch(): An alias for uss0.

  • usstb(): Calls sstb and returns only unique rows of the result, with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a data.frame.

  • uss1(): Splits strings in ... using a space (' ') as a delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • ussp(): Splits strings in ... using a pipe ('|') as a delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • ussd(): Splits strings in ... using a dot/period ('.') as a delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • ussb(): Splits strings in ... using a broken pipe ('¦') as a delimiter and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • usspd(): Splits strings in ... using pipes ('|') and dots/periods ('.') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • usspb(): Splits strings in ... using pipes ('|') and broken pipes ('¦') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • ussdb(): Splits strings in ... using dots/periods ('.') and broken pipes ('¦') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.

  • usspdb(): Splits strings in ... using pipes ('|'), dots/periods ('.'), and broken pipes ('¦') as delimiters and returns the unique values with pre-processing subject to specs in trm, sqz, and drop and post-processing subject to n. Returns a character vector.
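
The multi-delimiter variants can be approximated in base R with a regular-expression character class. The sketch below is an assumption-laden illustration rather than the package's code: split_pipe_dot() is a hypothetical helper that handles only pipes and dots, always trims, and drops blank strings in the manner described for drop = TRUE.

```r
# Hypothetical helper; not the package's implementation.
split_pipe_dot <- function(..., u = FALSE) {
  x <- as.character(unlist(list(...), use.names = FALSE))
  # Inside [ ], both '|' and '.' are literal characters, so no escaping is needed.
  parts <- unlist(strsplit(x, "[|.]"), use.names = FALSE)
  parts <- trimws(parts)
  parts <- parts[parts != ""]        # assumption: mimic dropping blank strings
  if (u) unique(parts) else parts
}

split_pipe_dot("super|cooled", "super.heated")
# "super" "cooled" "super" "heated"
split_pipe_dot("super|cooled", "super.heated", u = TRUE)
# "super" "cooled" "heated"
```

A character class keeps the split to a single pass over the strings; further single-character delimiters could be added to the class in the same way.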

See Also

Other strings: blank(), chn(), delim(), fsub(), gr, ipat(), makestr(), markdown_help(), maxnch(), ox(), ox_vals(), pgrid_help(), revstr(), spaces(), tocase(), weave()

Other chars: chn(), maxnch(), revstr(), spaces()

Examples

ss("", "super-cooled")
ss("|", "super||cooled", "super|heated")
ss0("super-cooled", "super-heated")
ss1("super cooled", "super heated")
ssp("super|cooled", "super|heated", u = TRUE)
ssd("super.cooled", "super.heated")
ssb("super¦cooled", "super¦heated")
sspd("super|cooled", "super.heated")
sspb("super|cooled", "super¦heated")
ssdb("super.cooled", "super¦heated")
sspdb("super|cooled¦|super|heated", u = TRUE)
sspdb(" super|cooled  ¦super..  heated", n = 3)
sspdb(" super|cooled  ¦super..  heated", trm = FALSE, sqz = FALSE, drop = FALSE, n = 3)
uss("", "super-cooled")
uss("|", "super|cooled", "super|heated")
uss0("super-cooled", "super-heated")
uss1("super cooled", "super heated")
ussp("super|cooled", "super|heated")
ussd("super.cooled", "super.heated")
ussb("super¦cooled", "super¦heated")
usspd("super|cooled", "super.heated")
usspb("super|cooled", "super¦heated")
ussdb("super.cooled", "super¦heated")
usspdb("super|cooled¦super|heated")
sstb("|", 'a|b|c|d', 'e|f|g|h', 'i|j|k|l', 'm|n|o|p')
sstb("|", 'a|b|c|d', 'e|f|g|h', 'i|j|k|l', 'm|n|o|p', name = 'original', part = 'letter')

j-martineau/uj documentation built on Sept. 14, 2024, 4:40 a.m.