dictionary: Dictionary of Variable Attributes
In HMDA: Holistic Multimodel Domain Analysis for Exploratory Machine Learning

dictionary

R Documentation

Description

Extracts a specified attribute from each column of a data frame and returns the result as a data frame that maps each variable name to its attribute value.

Usage

dictionary(df, attribute = "label", na.rm = TRUE)

Arguments

`df`	A data frame whose columns may have attached attributes.
`attribute`	A character string specifying the name of the attribute to extract from each column (e.g., "label").
`na.rm`	Logical; if `TRUE`, rows for which the attribute is missing (`NA`) are omitted from the output. Default is `TRUE`.

Details

The function iterates over each column of `df` and retrieves the specified attribute using `attr()`. If a column does not have the attribute, its description is `NA`; with `na.rm = TRUE` (the default), such rows are omitted from the output. The resulting data frame acts as a dictionary for the variables, which is particularly useful for documenting datasets during exploratory data analysis.
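
The behavior described above can be approximated in base R. The following is a minimal sketch for illustration, not the package's actual source: the name `dictionary_sketch` and the toy data frame are hypothetical, and the sketch assumes each attribute value can be represented as a single string.

```r
# Hypothetical re-implementation for illustration; not HMDA's source code.
dictionary_sketch <- function(df, attribute = "label", na.rm = TRUE) {
  # Look up the attribute on each column; attr() returns NULL when it is absent.
  values <- vapply(df, function(col) {
    value <- attr(col, attribute, exact = TRUE)
    if (is.null(value)) NA_character_ else as.character(value)[1]
  }, character(1))

  dict <- data.frame(
    name = names(df),
    description = unname(values),
    stringsAsFactors = FALSE
  )

  # Drop variables without the attribute when na.rm = TRUE.
  if (na.rm) dict <- dict[!is.na(dict$description), , drop = FALSE]
  rownames(dict) <- NULL
  dict
}

# Toy data: only `age` carries a "label" attribute.
toy <- data.frame(age = c(31, 45), score = c(7.5, 8.1))
attr(toy$age, "label") <- "Age in years"

dictionary_sketch(toy)                 # one row, for `age`
dictionary_sketch(toy, na.rm = FALSE)  # two rows; `score` has an NA description
```

Using `vapply()` with `character(1)` keeps the description column a plain character vector even when some columns lack the attribute.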

Value

A data frame with two columns:

`name`: The names of the variables in `df`.
`description`: The extracted attribute values from each variable.
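
Because the result is an ordinary two-column data frame, it can be converted into a named lookup vector, for example to relabel plot axes or table headers. The sketch below builds a small data frame by hand with the same `name` and `description` columns; the values are illustrative rather than actual output from `dictionary()`, and `label_lookup` is a hypothetical name.

```r
# Hand-built stand-in with the documented columns; values are illustrative only.
dict <- data.frame(
  name = c("INTG", "DMNR", "DILG"),
  description = c("Integrity", "Demeanor", "Diligence"),
  stringsAsFactors = FALSE
)

# Named character vector: variable name -> description.
label_lookup <- setNames(dict$description, dict$name)

# Look up a single description by variable name.
label_lookup[["DMNR"]]

# Fall back to the variable name itself when no description is available.
vars <- c("INTG", "RTEN")
labels <- ifelse(vars %in% names(label_lookup), label_lookup[vars], vars)
```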

Author(s)

E. F. Haghish

Examples

  # Example: generate a dictionary of variable labels for the USJudgeRatings dataset,
  # which contains lawyers' ratings of state judges in the US Superior Court.
  library(HMDA)
  data("USJudgeRatings")

  # Annotate some of the variables with a "label" attribute,
  # which is the attribute that dictionary() reads by default.
  attr(USJudgeRatings$CONT, "label") <- "Number of contacts of lawyer with judge"
  attr(USJudgeRatings$INTG, "label") <- "Integrity"
  attr(USJudgeRatings$DMNR, "label") <- "Demeanor"
  attr(USJudgeRatings$DILG, "label") <- "Diligence"

  # Generate the dictionary of labels
  dict <- dictionary(USJudgeRatings, "label")
  print(dict)

HMDA documentation built on April 4, 2025, 6:06 a.m.