new_lama_dictionary: Create a new lama_dictionary class object
In labelmachine: Make Labeling of R Data Sets Easy

Description Usage Arguments Value Translations Missing values lama_dictionary class objects See Also Examples

Generates an S3 class object, which holds the variable translations. There are three valid ways to use new_lama_dictionary in order to create a lama_dictionary class object:

No arguments were passed into ...: In this case new_lama_dictionary returns an empty lama_dictionary class object (e.g. dict <- new_lama_dictionary()).
The first argument is a list: In this case, only the first argument of new_lama_dictionary is used, and it does not need to be named. The passed-in object must be a named list that contains all translations to be added to the new lama_dictionary class object. Each item of the named list must be a named character vector defining a translation (e.g. new_lama_dictionary(list(area = c("0" = "urban", "1" = "rural"), density = c(l = "Low", h = "High"))) generates a lama_dictionary class object holding the translations "area" and "density").
The first argument is a character vector: In this case, more than one argument may be passed in, but all arguments must be named and each must be a named character vector defining a translation (e.g. new_lama_dictionary(area = c("0" = "urban", "1" = "rural"), density = c(l = "Low", h = "High")) generates a lama_dictionary class object holding the translations "area" and "density"). The argument names are used as the names under which the translations are added to the new lama_dictionary class object.

new_lama_dictionary(...)

## S3 method for class 'list'
new_lama_dictionary(.data = NULL, ...)

## S3 method for class 'character'
new_lama_dictionary(...)

## Default S3 method:
new_lama_dictionary(...)

...

None, one or more named/unnamed arguments. Depending on the type of the first argument passed into new_lama_dictionary, there are different valid ways of using new_lama_dictionary:

No arguments were passed into ...: In this case new_lama_dictionary returns an empty lama_dictionary class object (e.g. dict <- new_lama_dictionary()).
The first argument is a list: In this case, only the first argument of new_lama_dictionary is used, and it may be passed unnamed. The passed-in object must be a named list that contains all translations to be added to the new lama_dictionary class object. Each item of the named list must be a named character vector defining a translation (e.g. new_lama_dictionary(list(area = c("0" = "urban", "1" = "rural"), density = c(l = "Low", h = "High"))) generates a lama_dictionary class object holding the translations "area" and "density").
The first argument is a character vector: In this case, more than one argument may be passed in, but all arguments given to new_lama_dictionary must be named and each must be a named character vector defining a translation (e.g. new_lama_dictionary(area = c("0" = "urban", "1" = "rural"), density = c(l = "Low", h = "High")) generates a lama_dictionary class object holding the translations "area" and "density"). The argument names are used as the names under which the translations are added to the new lama_dictionary class object.

.data

A named list object, where each list entry corresponds to a translation that should be added to the lama_dictionary object (e.g. new_lama_dictionary(list(area = c("0" = "urban", "1" = "rural"), density = c(l = "Low", h = "High"))) generates a lama_dictionary class object holding the translations "area" and "density"). The names of the list entries are the names under which the translations will be added to the new lama_dictionary class object (e.g. area and density). Each list entry must be a named character vector defining a translation (e.g. c("0" = "urban", "1" = "rural") is the translation with the name area and c(l = "Low", h = "High") is the translation with the name density).

A new lama_dictionary class object holding the passed in translations.

A translation is a named character vector of non-zero length. It defines which labels (of type character) are assigned to which values (of type character, logical or numeric). For example, the translation c("0" = "urban", "1" = "rural") assigns the label "urban" to the value 0 and "rural" to the value 1, so the variable x = c(0, 0, 1) is translated to x_new = c("urban", "urban", "rural"). A translation (named character vector) therefore contains the following information:

The names of the character vector entries correspond to the original variable levels. Variables of types numeric or logical are turned automatically into a character vector (e.g. 0 and 1 are treated like "0" and "1").
The entries (character strings) of the character vector correspond to the new labels, which will be assigned to the original variable levels. It is also allowed to have missing labels (NAs). In this case, the original values are mapped onto missing values.
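The lookup described above can be sketched in base R. This is only an illustration of the name-to-label mapping, not the package's implementation; in practice lama_translate() is the function that applies a translation.

```r
# A translation: names are the original levels, entries are the new labels.
translation <- c("0" = "urban", "1" = "rural")

# The original variable is numeric, so it is turned into character
# strings first ("0", "0", "1") and then used to index the translation.
x <- c(0, 0, 1)
x_new <- unname(translation[as.character(x)])

print(x_new)
# "urban" "urban" "rural"
```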

The function lama_translate() is used to apply a translation to a variable. The resulting vector with the assigned labels can be of the following types:

character: An unordered vector holding the new character labels.
factor with character levels: An ordered vector holding the new character labels.

The original variable can be of the following types:

character vector: This is the simplest case. The character values will be replaced by the corresponding labels.
numeric or logical vector: Vectors of type numeric or logical will be turned into character vectors automatically before the translation process and then simply processed like in the character case. Therefore, it is sufficient to define the translation mapping for the character case, since it also covers the numeric and logical case.
factor vector with levels of any type: When translating factor variables one can decide whether or not to keep the original ordering. Like in the other cases the levels of the factor variable will always be turned into character strings before the translation process.
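As a rough base R sketch of the input and output types listed above: the helper translate_sketch is hypothetical and is not part of labelmachine. Taking the factor level order from the order of the translation entries is an assumption made for this illustration, not a statement about what lama_translate() does.

```r
# Hypothetical helper (not part of labelmachine): convert the input to
# character strings, then look the strings up in the translation.
translate_sketch <- function(x, translation) {
  unname(translation[as.character(x)])
}

yes_no  <- c("TRUE" = "yes", "FALSE" = "no")
density <- c(l = "Low", h = "High")

# Logical input: TRUE/FALSE are treated like "TRUE"/"FALSE".
flags_new <- translate_sketch(c(TRUE, FALSE, TRUE), yes_no)

# Factor input: the factor values are converted to their level strings first.
dens <- factor(c("l", "h", "l"))
dens_chr <- translate_sketch(dens, density)

# Result as a factor; the level order follows the translation (assumption).
dens_fct <- factor(dens_chr, levels = unname(density))

print(flags_new)        # "yes" "no" "yes"
print(dens_chr)         # "Low" "High" "Low"
print(levels(dens_fct)) # "Low" "High"
```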

It is also possible to handle missing values with lama_translate(). To do so, the translation must contain information on how to handle a missing value. In such a translation the missing value (NA) is escaped with the character string "NA_". This can be useful in two situations:

All missing values should be labeled (e.g. the translation c("0" = "urban", "1" = "rural", NA_ = "missing") assigns the character string "missing" to all missing values of a variable).
Map some original values to NA (e.g. the translation c("0" = "urban", "1" = "rural", "2" = "NA_", "3" = "NA_") assigns NA (the missing character) to the original values 2 and 3). In this case the escape mechanism is not always required; it is only needed when defining the translations inside a YAML file, since the YAML parser does not recognize missing values.
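The two uses of the "NA_" escape can be mimicked in base R as follows. This is a hand-written sketch of the idea under the rules stated above, not the code used by lama_translate().

```r
# Case 1: label all missing values.
tr_label_na <- c("0" = "urban", "1" = "rural", NA_ = "missing")
x <- c(0, NA, 1)
keys <- as.character(x)
keys[is.na(keys)] <- "NA_"           # escape NA so it can be looked up by name
labeled <- unname(tr_label_na[keys])

# Case 2: map some original values to NA.
tr_to_na <- c("0" = "urban", "1" = "rural", "2" = "NA_", "3" = "NA_")
y <- c(0, 2, 3, 1)
mapped <- unname(tr_to_na[as.character(y)])
mapped[mapped %in% "NA_"] <- NA_character_  # unescape "NA_" back to a real NA

print(labeled) # "urban" "missing" "rural"
print(mapped)  # "urban" NA NA "rural"
```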

Each lama_dictionary class object can contain multiple translations, each with a unique name under which the translation can be found. The function lama_translate() uses a lama_dictionary class object to translate a normal vector or to translate one or more columns in a data.frame. Sometimes it may be necessary to have different translations for the same variable. In this case it is best to have multiple translations with different names (e.g. area_short = c("0" = "urb", "1" = "rur") and area = c("0" = "urban", "1" = "rural")).
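To illustrate keeping several translations for one variable under different names, the sketch below uses a plain named list as a stand-in for a lama_dictionary object. The stand-in and the variable names are assumptions for this example; a real dictionary would be created with new_lama_dictionary() and applied with lama_translate().

```r
# Plain named list standing in for a dictionary (assumption for illustration).
dict_sketch <- list(
  area       = c("0" = "urban", "1" = "rural"),
  area_short = c("0" = "urb", "1" = "rur")
)

x <- c(0, 1, 1)

# The same variable translated with two differently named translations.
x_long  <- unname(dict_sketch[["area"]][as.character(x)])
x_short <- unname(dict_sketch[["area_short"]][as.character(x)])

print(x_long)  # "urban" "rural" "rural"
print(x_short) # "urb" "rur" "rur"
```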

is.lama_dictionary(), as.lama_dictionary(), lama_translate(), lama_to_factor(), lama_translate_all(), lama_to_factor_all(), lama_read(), lama_write(), lama_select(), lama_rename(), lama_mutate(), lama_merge()

  ## Example-1: Initialize a lama-dictionary from a list object
  ##            holding the translations
  dict <- new_lama_dictionary(list(
    country = c(uk = "United Kingdom", fr = "France", NA_ = "other countries"),
    language = c(en = "English", fr = "French")
  ))
  dict

  ## Example-2: Initialize the lama-dictionary directly
  ##            by assigning each translation to a name
  dict <- new_lama_dictionary(
    country = c(uk = "United Kingdom", fr = "France", NA_ = "other countries"),
    language = c(en = "English", fr = "French")
  )
  dict