generate_dictionary: Create a data dictionary from labelled data

View source: R/generate_dictionary.R

generate_dictionaryR Documentation

Create a data dictionary from labelled data

Description

generate_dictionary() creates a data dictionary from a specified data frame. This function is particularly useful for understanding and documenting the structure of your dataset, similar to data dictionaries in Stata or SPSS.

Usage

generate_dictionary(data)

Arguments

data

a data frame or a survey object

Details

The function returns a tibble (a modern version of R's data frame) with the following columns:

  • position: An integer vector indicating the column position in the data frame.

  • variable: A character vector containing the names of the variables (columns).

  • description: A character vector with a human-readable description of each variable.

  • column type: A character vector specifying the data type (e.g., numeric, character) of each variable.

  • missing: An integer vector indicating the count of missing values for each variable.

  • levels: A list vector containing the levels for categorical variables, if applicable.

Value

A tibble representing the data dictionary. Each row corresponds to a variable in the original data frame, providing detailed information about the variable's characteristics.

Examples


# Creating a data dictionary from an SPSS file

file_path <- system.file("extdata", "Wages.sav", package = "bulkreadr")

wage_data <- read_spss_data(file = file_path)

generate_dictionary(wage_data)



bulkreadr documentation built on May 29, 2024, 1:35 a.m.