dictionary: Dictionary of Variable Attributes

dictionary    R Documentation

Description

Extracts a specified attribute from each column of a data frame and returns a data frame that maps each variable name to its attribute value.

Usage

dictionary(df, attribute = "label", na.rm = TRUE)

Arguments

df

A data frame whose columns may have attached attributes.

attribute

A character string specifying the name of the attribute to extract from each column (e.g., "label").

na.rm

Logical; if TRUE, rows for which the attribute is missing (NA) are omitted from the output. Default is TRUE.

Details

The function iterates over the columns of df and retrieves the specified attribute from each one using attr(). If a column does not have the attribute, its description is NA; with na.rm = TRUE (the default), such rows are then omitted from the output. The resulting data frame serves as a dictionary for the variables, which is useful for documenting datasets during exploratory data analysis.
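The behaviour described above can be approximated in a few lines of base R. The following is only a sketch under stated assumptions, not the package's actual source: the name dictionary_sketch and the toy data frame are hypothetical, and collapsing multi-valued attributes with "; " is a choice made here for illustration.

```r
# Hypothetical re-implementation for illustration; not the HMDA source.
dictionary_sketch <- function(df, attribute = "label", na.rm = TRUE) {
  stopifnot(is.data.frame(df), is.character(attribute), length(attribute) == 1)

  # Look up the attribute on every column; use NA when it is absent.
  description <- vapply(df, function(column) {
    value <- attr(column, attribute, exact = TRUE)
    if (is.null(value)) {
      NA_character_
    } else {
      # Assumption: multi-valued attributes are collapsed into one string.
      paste(as.character(value), collapse = "; ")
    }
  }, character(1))

  dict <- data.frame(
    name = names(df),
    description = unname(description),
    stringsAsFactors = FALSE
  )

  # Optionally drop variables that do not carry the attribute.
  if (na.rm) dict <- dict[!is.na(dict$description), , drop = FALSE]
  rownames(dict) <- NULL
  dict
}

# Toy data: only one of the two columns is labelled.
toy <- data.frame(age = c(31, 45), height = c(172, 180))
attr(toy$age, "label") <- "Age in years"

labelled_only <- dictionary_sketch(toy)                 # unlabelled columns dropped
all_variables <- dictionary_sketch(toy, na.rm = FALSE)  # unlabelled columns kept as NA
```

In this sketch, labelled_only keeps only the age row, while all_variables also lists height with an NA description.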

Value

A data frame with two columns:

name

The names of the variables in df.

description

The extracted attribute values from each variable.
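Because the result is an ordinary two-column data frame, it can be converted into a lookup vector keyed by variable name. The sketch below uses a hand-built data frame as a stand-in for the output of dictionary(); its contents are illustrative and the object name lookup is an assumption.

```r
# Hand-built stand-in for the output of dictionary(); values are illustrative.
dict <- data.frame(
  name = c("INTG", "DMNR"),
  description = c("Integrity", "Demeanor"),
  stringsAsFactors = FALSE
)

# Named character vector mapping variable name to description.
lookup <- setNames(dict$description, dict$name)

# Retrieve the description of a single variable by name.
dmnr_label <- lookup[["DMNR"]]
```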

Author(s)

E. F. Haghish

Examples

  # Example: generate a dictionary of variable labels for the USJudgeRatings dataset,
  # which contains lawyers' ratings of state judges in the US Superior Court.
  data("USJudgeRatings")

  # Annotate a few variables with a "label" attribute,
  # which is the attribute that dictionary() reads by default.
  attr(USJudgeRatings$CONT, "label") <- "Number of contacts of lawyer with judge"
  attr(USJudgeRatings$INTG, "label") <- "Integrity"
  attr(USJudgeRatings$DMNR, "label") <- "Demeanor"
  attr(USJudgeRatings$DILG, "label") <- "Diligence"

  # Generate the dictionary of labels
  dict <- dictionary(USJudgeRatings, "label")
  print(dict)


HMDA documentation built on April 4, 2025, 6:06 a.m.