set_labels: Add value labels to variables

View source: R/set_labels.R

set_labelsR Documentation

Add value labels to variables

Description

This function adds labels as attribute (named "labels") to a variable or vector x, resp. to a set of variables in a data frame or a list-object. A use-case is, for instance, the sjPlot-package, which supports labelled data and automatically assigns labels to axes or legends in plots or to be used in tables. val_labels() is intended for use within pipe-workflows and has a tidyverse-consistent syntax, including support for quasi-quotation (see 'Examples').

Usage

set_labels(
  x,
  ...,
  labels,
  force.labels = FALSE,
  force.values = TRUE,
  drop.na = TRUE
)

val_labels(x, ..., force.labels = FALSE, force.values = TRUE, drop.na = TRUE)

Arguments

x

A vector or data frame.

...

For set_labels(), Optional, unquoted names of variables that should be selected for further processing. Required, if x is a data frame (and no vector) and only selected variables from x should be processed. You may also use functions like : or tidyselect's select-helpers.

For val_labels(), pairs of named vectors, where the name equals the variable name, which should be labelled, and the value is the new variable label. val_labels() also supports quasi-quotation (see 'Examples').

labels

(Named) character vector of labels that will be added to x as "labels" or "value.labels" attribute.

  • if labels is not a named vector, its length must equal the value range of x, i.e. if x has values from 1 to 3, labels should have a length of 3;

  • if length of labels is intended to differ from length of unique values of x, a warning is given. You can still add missing labels with the force.labels or force.values arguments; see 'Note'.

  • if labels is a named vector, value labels will be set accordingly, even if x has a different length of unique values. See 'Note' and 'Examples'.

  • if x is a data frame, labels may also be a list of (named) character vectors;

  • if labels is a list, it must have the same length as number of columns of x;

  • if labels is a vector and x is a data frame, labels will be applied to each column of x.

Use labels = "" to remove labels-attribute from x.

force.labels

Logical; if TRUE, all labels are added as value label attribute, even if x has less unique values then length of labels or if x has a smaller range then length of labels. See 'Examples'. This parameter will be ignored, if labels is a named vector.

force.values

Logical, if TRUE (default) and labels has less elements than unique values of x, additional values not covered by labels will be added as label as well. See 'Examples'. This parameter will be ignored, if labels is a named vector.

drop.na

Logical, whether existing value labels of tagged NA values (see tagged_na) should be removed (drop.na = TRUE, the default) or preserved (drop.na = FALSE). See get_na for more details on tagged NA values.

Value

x with value label attributes; or with removed label-attributes if labels = "". If x is a data frame, the complete data frame x will be returned, with removed or added to variables specified in ...; if ... is not specified, applies to all variables in the data frame.

Note

  • if labels is a named vector, force.labels and force.values will be ignored, and only values defined in labels will be labelled;

  • if x has less unique values than labels, redundant labels will be dropped, see force.labels;

  • if x has more unique values than labels, only matching values will be labelled, other values remain unlabelled, see force.values;

If you only want to change partial value labels, use add_labels instead. Furthermore, see 'Note' in get_labels.

See Also

See vignette Labelled Data and the sjlabelled-Package for more details; set_label to manually set variable labels or get_label to get variable labels; add_labels to add additional value labels without replacing the existing ones.

Examples

if (require("sjmisc")) {
  dummy <- sample(1:4, 40, replace = TRUE)
  frq(dummy)

  dummy <- set_labels(dummy, labels = c("very low", "low", "mid", "hi"))
  frq(dummy)

  # assign labels with named vector
  dummy <- sample(1:4, 40, replace = TRUE)
  dummy <- set_labels(dummy, labels = c("very low" = 1, "very high" = 4))
  frq(dummy)

  # force using all labels, even if not all labels
  # have associated values in vector
  x <- c(2, 2, 3, 3, 2)
  # only two value labels
  x <- set_labels(x, labels = c("1", "2", "3"))
  x
  frq(x)

  # all three value labels
  x <- set_labels(x, labels = c("1", "2", "3"), force.labels = TRUE)
  x
  frq(x)

  # create vector
  x <- c(1, 2, 3, 2, 4, NA)
  # add less labels than values
  x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = FALSE)
  x
  # add all necessary labels
  x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = TRUE)
  x

  # set labels and missings
  x <- c(1, 1, 1, 2, 2, -2, 3, 3, 3, 3, 3, 9)
  x <- set_labels(x, labels = c("Refused", "One", "Two", "Three", "Missing"))
  x
  set_na(x, na = c(-2, 9))
}


if (require("haven") && require("sjmisc")) {
  x <- labelled(
    c(1:3, tagged_na("a", "c", "z"), 4:1),
    c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
      "Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
  )
  # get current NA values
  x
  get_na(x)
  # lose value labels from tagged NA by default, if not specified
  set_labels(x, labels = c("New Three" = 3))
  # do not drop na
  set_labels(x, labels = c("New Three" = 3), drop.na = FALSE)


  # set labels via named vector,
  # not using all possible values
  data(efc)
  get_labels(efc$e42dep)

  x <- set_labels(
    efc$e42dep,
    labels = c(`independent` = 1,
               `severe dependency` = 2,
               `missing value` = 9)
    )
  get_labels(x, values = "p")
  get_labels(x, values = "p", non.labelled = TRUE)

  # labels can also be set for tagged NA value
  # create numeric vector
  x <- c(1, 2, 3, 4)
  # set 2 and 3 as missing, which will automatically set as
  # tagged NA by 'set_na()'
  x <- set_na(x, na = c(2, 3))
  x
  # set label via named vector just for tagged NA(3)
  set_labels(x, labels = c(`New Value` = tagged_na("3")))

  # setting same value labels to multiple vectors
  dummies <- data.frame(
    dummy1 = sample(1:4, 40, replace = TRUE),
    dummy2 = sample(1:4, 40, replace = TRUE),
    dummy3 = sample(1:4, 40, replace = TRUE)
  )

  # and set same value labels for two of three variables
  test <- set_labels(
    dummies, dummy1, dummy2,
    labels = c("very low", "low", "mid", "hi")
  )
  # see result...
  get_labels(test)
}

# using quasi-quotation
if (require("rlang") && require("dplyr")) {
  dummies <- data.frame(
    dummy1 = sample(1:4, 40, replace = TRUE),
    dummy2 = sample(1:4, 40, replace = TRUE),
    dummy3 = sample(1:4, 40, replace = TRUE)
  )

  x1 <- "dummy1"
  x2 <- c("so low", "rather low", "mid", "very hi")

  dummies %>%
    val_labels(
      !!x1 := c("really low", "low", "a bit mid", "hi"),
      dummy3 = !!x2
    ) %>%
    get_labels()

  # ... and named vectors to explicitly set value labels
  x2 <- c("so low" = 4, "rather low" = 3, "mid" = 2, "very hi" = 1)
  dummies %>%
    val_labels(
      !!x1 := c("really low" = 1, "low" = 3, "a bit mid" = 2, "hi" = 4),
      dummy3 = !!x2
    ) %>% get_labels(values = "p")
}

sjlabelled documentation built on April 10, 2022, 5:05 p.m.