split_fill: Split and fill a chr vector
In slin30/wzMisc: Miscellaneous functions by WZ

split_fill

R Documentation

Split and fill a chr vector

Description

Split a character column on the pattern given by split_on and return the result as a melted data.table, identified by IDcol.

Usage

split_fill(dat, targ, split_on, IDcol, rebind = FALSE, keep.targ = FALSE, ...)

Arguments

`dat`	a data.table
`targ`	chr; vector of length 1 denoting column that contains data to be split
`split_on`	chr; what pattern should be used to perform the split?
`IDcol`	chr; vector of length 1 denoting the column in dat containing the ID to be used for melting
`rebind`	logical; should the original columns be appended back to the output? Defaults to `FALSE`
`keep.targ`	logical; only used if rebind = `TRUE`; keep the column that was split on? Defaults to `FALSE`
`...`	Other (preferably named) arguments to pass on to `strsplit`, aside from split

Details

This is a convenience-convenience (not a typo) wrapper around data.table::tstrsplit, which is itself a convenience wrapper around transpose(strsplit(...)). It takes advantage of the performance of data.table's C-level transpose and adds facilities to melt and rejoin selectively.
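
To make the split-then-melt idea concrete, here is a minimal sketch that uses only data.table. It is not the source of split_fill; the toy table and the intermediate names (wide, long) are assumptions made for illustration, and the real function may differ in column naming and in how it treats missing pieces.

```r
library(data.table)

# Toy input (hypothetical): one ID column and one delimited character column
dt <- data.table(ID = 1:3, targ = c("A1|B1|C1", "A2|B2", "A3"))

# tstrsplit() is transpose(strsplit(...)): one list element per split position,
# with shorter rows padded by NA
wide <- as.data.table(tstrsplit(dt$targ, "|", fixed = TRUE))
wide[, ID := dt$ID]

# Melt to long form: one row per ID and split position
long <- melt(wide, id.vars = "ID")
```

In this sketch the NA padding survives the melt, because melt.data.table defaults to na.rm = FALSE.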

Value

A melted data.table using IDcol as id.vars for melt.data.table, with targ split by split_on.

If rebind is TRUE, the original columns are returned as well, with a single IDcol column as specified in the input. This is performed via a data.table ad-hoc join, using IDcol in j. The input targ column is also returned if keep.targ is TRUE.
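
The rebind behaviour can be approximated with an ordinary data.table join on the ID column. The sketch below is an assumption about the general shape of the operation, not the package's code; the column other and the names pieces, long and rebound are hypothetical.

```r
library(data.table)

# Toy input (hypothetical) with one extra column to carry through the join
dt <- data.table(ID = 1:2, targ = c("A1|B1", "A2|B2"), other = c("x", "y"))

# Long form of the split, built with a plain tstrsplit + melt pipeline
pieces <- as.data.table(tstrsplit(dt$targ, "|", fixed = TRUE))
pieces[, ID := dt$ID]
long <- melt(pieces, id.vars = "ID")

# Join the original columns back by ID; excluding targ first mimics keep.targ = FALSE
rebound <- dt[, !"targ"][long, on = "ID"]
```

Leaving targ in the left-hand table instead would correspond to keep.targ = TRUE.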

Note

targ currently is limited to a vector of length 1, as is IDcol. This is likely to change in the future, to make this function more flexible and consistent with the capabilities of melt.data.table.

Use ... to pass e.g. fixed = TRUE or perl = TRUE to strsplit. See documentation for tstrsplit.
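
As a small illustration of why fixed = TRUE is useful, the sketch below calls tstrsplit directly (not split_fill) on a hypothetical vector x. A pipe is a regex metacharacter, so it must be escaped unless the pattern is treated literally.

```r
library(data.table)

x <- c("A1|B1|C1", "A2|B2|C2")

# As a regular expression, "|" must be escaped to match a literal pipe
by_regex <- tstrsplit(x, "\\|")

# With fixed = TRUE the pattern is taken literally, so no escaping is needed
by_fixed <- tstrsplit(x, "|", fixed = TRUE)

# Both calls produce the same three-element list
identical(by_regex, by_fixed)
```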

Examples

library(data.table)
library(wzMisc)  # provides split_fill(); package slin30/wzMisc
dt <- data.table(
  ID = 1:10,
  targ = sapply(1:10, function(f)
    paste0(
      LETTERS[1:5],
      f,
      collapse = "|"
    )
  )
)
head(split_fill(dat = dt, targ = "targ", split_on = "\\|", IDcol = "ID"))

# Demonstrating rebind
dt[, targ_additional := targ]
head(split_fill(dat = dt, targ = "targ", split_on = "\\|", IDcol = "ID", rebind = TRUE))

slin30/wzMisc documentation built on Jan. 27, 2023, 1 a.m.