rbind_labelled: Combine data frames with columns of class 'labelled'

rbind_labelled    R Documentation

Description

Combine data frames with columns of class 'labelled'

Usage

rbind_labelled(..., labels = NULL, warn = TRUE)

Arguments

...

Data frames to bind together, potentially with columns of class 'labelled'. The first argument can be a list of data frames, similar to 'plyr::rbind.fill'.

labels

A named list providing vectors of value labels or describing how to handle columns of class 'labelled'. See details for usage.

warn

Logical; whether to warn when combining variables with different value labels. Defaults to TRUE.

Details

The argument 'labels' controls how columns of class 'labelled' are bound. Typical use is to provide a named list with one element for each labelled column. Each element is either a vector of value labels to apply to the column, or the character string "concatenate", which indicates that labels should be concatenated so that every unique label becomes a distinct value in the combined vector. This is accomplished by converting to character strings, binding, and then casting back to labelled. For labelled columns with no entry in the 'labels' argument, the default behaviour is that the combined data inherit the labels from the first data frame that has labels for that column.

See examples.
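
The concatenation described above (convert to character strings, bind, cast back to labelled) can be sketched with haven alone. This is an illustrative sketch, not the package's implementation: the names x1, x2, chr, lev, new_labels and combined are made up here, and assigning new codes in order of first appearance is an assumption; rbind_labelled may order the codes differently.

```r
library(haven)

# Two labelled vectors that reuse the codes 1 and 2 with different meanings
x1 <- labelled(c(1L, 2L, 3L), c("reg 1" = 1L, "reg 2" = 2L, "reg 3" = 3L))
x2 <- labelled(c(1L, 2L), c("reg A" = 1L, "reg B" = 2L))

# Step 1: convert each vector to the character strings of its value labels,
# then bind the character vectors
chr <- c(as.character(as_factor(x1)), as.character(as_factor(x2)))

# Step 2: give every unique label its own integer code
# (assumption: codes follow order of first appearance)
lev <- unique(chr)
new_labels <- setNames(seq_along(lev), lev)

# Step 3: cast back to labelled using the new codes
combined <- labelled(unname(new_labels[chr]), new_labels)
combined
```

Because the new codes are assigned after binding, the original integer values are not preserved, which matches the behaviour noted in the examples.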

Value

A data frame.

Examples

df1 <- data.frame(
  area    = haven::labelled(c(1L, 2L, 3L), c("reg 1"=1, "reg 2"=2, "reg 3"=3)),
  climate = haven::labelled(c(0L, 1L, 1L), c("cold"=0, "hot"=1))
)
df2 <- data.frame(
  area    = haven::labelled(c(1L, 2L), c("reg A"=1, "reg B"=2)),
  climate = haven::labelled(c(1L, 0L), c("cold"=0, "warm"=1))
)

# Default: all data frames inherit labels from first df. Incorrect if
# "reg 1" and "reg A" are from different countries, for example.
dfA <- rbind_labelled(df1, df2)
haven::as_factor(dfA)

# Concatenate value labels for "area". Regions are coded separately,
# and the original integer values are lost (necessarily, since there are
# now more levels). For "climate", the codes "1 = hot" and "1 = warm" are
# treated as the same outcome, inheriting "1 = hot" from df1 by default.
dfB <- rbind_labelled(df1, df2, labels=list(area = "concatenate"))
dfB
haven::as_factor(dfB)

# We can specify to code as "1=warm/hot" rather than inheriting "hot".
dfC <- rbind_labelled(df1, df2,
  labels=list(area = "concatenate", climate = c("cold"=0, "warm/hot"=1)))

dfC$climate
haven::as_factor(dfC)

# Or use `climate="concatenate"` to code "warm" and "hot" as different.
dfD <- rbind_labelled(df1, df2,
  labels=list(area = "concatenate", climate = "concatenate"))

dfD
haven::as_factor(dfD)
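
As noted under Arguments, the first argument can also be a list of data frames. The following is a minimal sketch, assuming the rdhs and haven packages are installed. The names dfs and dfE are illustrative, and warn = FALSE is used only to silence the warning about differing value labels.

```r
library(rdhs)

df1 <- data.frame(
  area    = haven::labelled(c(1L, 2L, 3L), c("reg 1"=1, "reg 2"=2, "reg 3"=3)),
  climate = haven::labelled(c(0L, 1L, 1L), c("cold"=0, "hot"=1))
)
df2 <- data.frame(
  area    = haven::labelled(c(1L, 2L), c("reg A"=1, "reg B"=2)),
  climate = haven::labelled(c(1L, 0L), c("cold"=0, "warm"=1))
)

# Pass the data frames as a single list instead of separate arguments
dfs <- list(df1, df2)
dfE <- rbind_labelled(dfs, warn = FALSE)
haven::as_factor(dfE)
```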


ropensci/rdhs documentation built on April 5, 2024, 11:50 a.m.