branch: Create and recombine variables based on a branching factor


View source: R/branch.R

Description

Create and recombine variables from a source variable and a branching factor (or a list of branching factors).

Usage

branch(x, f = x, .fill = 0)

unbranch(..., .ignore = 0, .fill = 0, .factors = c("character",
  "numeric"))

Arguments

x

A vector containing values to be divided into new variables.

f

A factor in the sense that as.factor(f) defines the grouping, or a list of such factors in which case their interaction is used for the grouping.

.fill

A single value to use to fill in missing values in the resulting branched variables.

...

Two or more vectors of equal length, which are to be combined into one new vector. If any two vectors have values at the same index that are not specified in .ignore, the function will report an error. It is also possible to pass one or more data frames and/or matrices (which will be coerced to a list of column vectors).

.ignore

A (potentially multi-item) vector of values to ignore when merging across the vectors in ....

.factors

A character string indicating whether to treat factors in ... as character (the default) or numeric.
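The grouping described for f can be previewed with base R alone. The following is a minimal sketch, assuming the interaction of a list of factors behaves like base `interaction()`; the package may compute the grouping differently, and the variable names here are illustrative only.

```r
# Two illustrative branching factors, each with two levels
b1 <- c(1, 1, 2, 2)
b2 <- c(1, 2, 1, 2)

# A single factor: as.factor() defines the groups
g1 <- as.factor(b1)
nlevels(g1)

# A list of factors: base R's interaction() crosses their levels,
# so two 2-level factors give up to four groups
g2 <- interaction(list(b1, b2))
nlevels(g2)
```

Under this assumption, the number of groups is the number of columns one would expect in the branched result.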

Details

These functions can be used to create dummy variables from a source variable, or to create multiple new variables from an existing variable where the values are mutually exclusive. This is useful when, for example, a survey involves two forms, with each form coded as separate variables that need to be merged together or, conversely, where a single variable contains data for different groups that need to be analyzed separately.
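The branching idea can be illustrated in base R. This is only a sketch under stated assumptions, not the package's implementation: `branch_sketch` and its `fill` argument are hypothetical names, and it handles a single factor without missing values.

```r
# Hypothetical helper (not part of mcode): build one column per level of f,
# holding x where f matches that level and `fill` everywhere else.
branch_sketch <- function(x, f, fill = 0) {
  f <- as.factor(f)
  out <- matrix(fill, nrow = length(x), ncol = nlevels(f),
                dimnames = list(NULL, levels(f)))
  # Matrix indexing: row i, column given by the integer code of f[i]
  out[cbind(seq_along(x), as.integer(f))] <- x
  out
}

x <- c(5, 3, 4, 1)
f <- c("a", "b", "a", "b")
m <- branch_sketch(x, f)
m
```

Each row of `m` has at most one value taken from `x`, which is what makes the columns mutually exclusive and lets them be merged back into a single vector later.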

Value

For branch, a matrix with length(x) rows and one column for each level of f.

For unbranch, a vector of the same length as each of the vectors in ..., in which values in the .ignore set are replaced with the corresponding non-ignored value from any other vector. If all vector items at a given position are in the .ignore set, the result vector at that index is .fill.
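The merging rule for unbranch can be sketched in base R. This is an illustration of the documented behavior for numeric vectors, not the package's code; `unbranch_sketch`, `ignore`, and `fill` are hypothetical names.

```r
# Hypothetical helper (not part of mcode): merge equal-length numeric vectors.
# At each index, keep the single value not in `ignore`; use `fill` if every
# value is ignored; stop if more than one non-ignored value is present.
unbranch_sketch <- function(..., ignore = 0, fill = 0) {
  vecs <- list(...)
  n <- length(vecs[[1]])
  out <- rep(fill, n)
  for (i in seq_len(n)) {
    vals <- vapply(vecs, function(v) v[i], numeric(1))
    keep <- vals[!(vals %in% ignore)]
    if (length(keep) > 1) {
      stop("more than one non-ignored value at index ", i)
    }
    if (length(keep) == 1) {
      out[i] <- keep
    }
  }
  out
}

p <- c(0, 2, 0, 4)
q <- c(1, 0, 0, 0)
u <- unbranch_sketch(p, q)                # index 3 is ignored in both: fill = 0
u_na <- unbranch_sketch(p, q, fill = NA)  # same merge, but index 3 becomes NA
```

In this sketch, passing two vectors that both hold a non-ignored value at the same index raises an error, which mirrors the mutual-exclusivity requirement described for the ... argument.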

See Also

mergeNA

Examples

# branch a vector into a matrix and unbranch the result
a <- sample(1:5, 20, TRUE)
b1 <- sample(1:2, 20, TRUE)
b2 <- sample(1:2, 20, TRUE)
branch(a, b1) # 2-column matrix
branch(a, list(b1, b2)) # 4-column matrix

# unbranch from a `branch` matrix
b <- branch(a, list(b1, b2))
u <- unbranch(b)
all.equal(a, u)

# unbranch multiple vectors with `NA` values
x <- c(NA,2,3,NA,NA,6,NA,NA,NA,10)
y <- c(NA,NA,NA,14,NA,NA,17,18,19,NA)
z <- c(NA,NA,NA,NA,25,NA,NA,NA,NA,NA)
unbranch(x,y, .ignore = NA)
unbranch(x,z, .ignore = NA)
unbranch(x,y,z, .ignore = NA)
# equivalent to `mergeNA`
mergeNA(x,y,z)

# unbranch multiple vectors with multiple `.ignore` values
m1 <- c(1,3,4,5,2)
m2 <- c(0,2,2,2,4)
unbranch(m1, m2, .ignore = c(1,2,3))
unbranch(m1, m2, .ignore = c(1,2,3), .fill = NA)

## Not run: 
  # fails for non-mutually exclusive missingness
  w <- c(NA,42,43,NA,25,NA,NA,NA,NA,NA)
  unbranch(x, w, .fill = NA) 

## End(Not run)

leeper/mcode documentation built on May 21, 2019, 12:37 a.m.