branch: Create and recombine variables based on a branching factor
In leeper/mcode: Functions to Merge and Recode Across Multiple Variables


View source: R/branch.R

Create and recombine variables from a source variable and a branching factor (or a list of branching factors).

branch(x, f = x, .fill = 0)

unbranch(..., .ignore = 0, .fill = 0, .factors = c("character",
  "numeric"))

`x`	A vector containing values to be divided into new variables.
`f`	A factor in the sense that `as.factor(f)` defines the grouping, or a list of such factors in which case their `interaction` is used for the grouping.
`.fill`	A single value to use to fill in missing values in the resulting branched variables.
`...`	Two or more vectors of equal length, which are to be combined into one new vector. If any two vectors have values at the same index that are not specified in `.ignore`, the function will report an error. It is also possible to pass one or more data frames and/or matrices (which will be coerced to a list of column vectors).
`.ignore`	A (potentially multi-item) vector of values to ignore when merging across the vectors in `...`.
`.factors`	A character string indicating whether to treat factors in `...` as character (the default) or numeric.

These functions can be used to create dummy variables from a source variable, or to create multiple new variables from an existing variable where the values are mutually exclusive. This is useful when, for example, a survey involves two forms, with each form coded as separate variables that need to be merged together or, conversely, where a single variable contains data for different groups that need to be analyzed separately.

For branch, a matrix with length(x) rows and one column for each level of f (or, when f is a list of factors, for each level of their interaction).
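
The shape described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: `branch_sketch` and its `fill` argument are hypothetical names introduced here, and the sketch assumes `f` contains no missing values. It only shows how a grouping (or the `interaction` of several groupings) maps each value of `x` to one column, with every other cell set to the fill value.

```r
# Hypothetical helper (not part of mcode): a base-R sketch of branching.
# Assumes f has no NA values.
branch_sketch <- function(x, f, fill = 0) {
  # A list of factors is collapsed into one grouping via interaction()
  g <- if (is.list(f)) interaction(f) else as.factor(f)
  lev <- levels(g)
  # Start from a matrix filled entirely with `fill`
  out <- matrix(fill, nrow = length(x), ncol = length(lev),
                dimnames = list(NULL, lev))
  # Put each value of x into the column of its own group
  out[cbind(seq_along(x), as.integer(g))] <- x
  out
}

x <- c(10, 20, 30, 40)
f <- c("a", "b", "a", "b")
m <- branch_sketch(x, f)
dim(m)  # one row per element of x, one column per level of f

# Two two-level factors give one column per combination of levels
m2 <- branch_sketch(1:4, list(c(1, 2, 1, 2), c(1, 1, 2, 2)))
ncol(m2)
```

Because `interaction` keeps unobserved combinations by default, the number of columns in the list case is the product of the numbers of levels, which matches the 4-column result noted in the examples.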

For unbranch, a vector of the same length as each vector in ..., in which missing (ignored) values are replaced by the corresponding non-missing value from any other input vector. If all values at a given position are in the .ignore set, the result at that index is .fill.
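
The merging rule can be sketched in base R as follows. Again, this is an illustration under stated assumptions rather than the package's code: `unbranch_sketch`, `ignore`, and `fill` are hypothetical names, the inputs are assumed to be plain vectors of equal length, and the error message is invented for the example.

```r
# Hypothetical helper (not part of mcode): a base-R sketch of merging.
unbranch_sketch <- function(..., ignore = 0, fill = 0) {
  vecs <- list(...)
  n <- length(vecs[[1]])
  stopifnot(all(lengths(vecs) == n))
  # Bind inputs as columns so each row holds the candidates for one position
  m <- do.call(cbind, vecs)
  # %in% matches NA against NA, so ignore = NA works as well
  keep <- !matrix(m %in% ignore, nrow = n)
  # More than one non-ignored value at a position is treated as an error
  if (any(rowSums(keep) > 1)) {
    stop("more than one non-ignored value at the same index")
  }
  out <- rep(fill, n)
  hit <- which(keep, arr.ind = TRUE)  # (row, column) of each kept value
  out[hit[, 1]] <- m[hit]
  out
}

x <- c(NA, 2, NA, NA)
y <- c(1, NA, NA, 4)
merged <- unbranch_sketch(x, y, ignore = NA, fill = NA)
merged  # position 3 is ignored in both inputs, so it receives `fill`
```

Positions where exactly one input holds a non-ignored value take that value, positions where every input is ignored take `fill`, and overlapping non-ignored values stop with an error, mirroring the behaviour described for `...` and `.ignore`.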

mergeNA

# branch a vector in a matrix and unbranch the result
a <- sample(1:5, 20, TRUE)
b1 <- sample(1:2, 20, TRUE)
b2 <- sample(1:2, 20, TRUE)
branch(a, b1) # 2-column matrix
branch(a, list(b1, b2)) # 4-column matrix

# unbranch from a `branch` matrix
b <- branch(a, list(b1, b2))
u <- unbranch(b)
all.equal(a, u)

# unbranch multiple vectors with `NA` values
x <- c(NA,2,3,NA,NA,6,NA,NA,NA,10)
y <- c(NA,NA,NA,14,NA,NA,17,18,19,NA)
z <- c(NA,NA,NA,NA,25,NA,NA,NA,NA,NA)
unbranch(x,y, .ignore = NA)
unbranch(x,z, .ignore = NA)
unbranch(x,y,z, .ignore = NA)
# equivalent to `mergeNA`
mergeNA(x,y,z)

# unbranch multiple vectors with multiple `.ignore` values
m1 <- c(1,3,4,5,2)
m2 <- c(0,2,2,2,4)
unbranch(m1, m2, .ignore = c(1,2,3))
unbranch(m1, m2, .ignore = c(1,2,3), .fill = NA)

## Not run: 
  # fails when missingness is not mutually exclusive
  w <- c(NA,42,43,NA,25,NA,NA,NA,NA,NA)
  unbranch(x, w, .fill = NA) 

## End(Not run)