clean_spss_data: Clean SPSS Data Imported via 'haven'

View source: R/clean_spss_data.R

clean_spss_data    R Documentation

Clean SPSS Data Imported via haven

Description

This function cleans SPSS .sav data imported using haven, converting labelled variables into factors and cleaning variable names. It also creates a data dictionary and a list of factor variables with their levels and labels.

Usage

clean_spss_data(data, method = c("manual", "forcats"))

Arguments

data

A data frame or tibble imported with haven::read_sav().

method

A string specifying how labelled variables are converted to factors. Options are "manual" (uses factor() with each variable's labels attribute) or "forcats" (uses forcats::as_factor() with levels = "labels"). Default is "manual".
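The two conversion styles named for the method argument can be illustrated on a single labelled vector. This is a minimal sketch, not the package's actual implementation: the toy vector sex and the object names manual and via_as_factor are made up for illustration, and as_factor() is called through haven, which supplies the method for labelled vectors.

```r
library(haven)

# Hypothetical labelled vector standing in for a column read by haven::read_sav()
sex <- labelled(c(1, 2, 2, 1), labels = c(Male = 1, Female = 2))

# "manual"-style conversion: build the factor from the labels attribute
lbls <- attr(sex, "labels")  # named numeric vector of value labels
manual <- factor(zap_labels(sex), levels = unname(lbls), labels = names(lbls))

# "forcats"-style conversion: let as_factor() use the value labels as levels
via_as_factor <- as_factor(sex, levels = "labels")

print(manual)
print(via_as_factor)
```

For a fully labelled vector like this one, both approaches should yield the same levels. They may differ when some values have no label, which is worth checking on real data before choosing a method.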

Value

A list with the following components:

data

A cleaned data frame with labelled variables converted to factors.

dictionary

A tibble listing each variable name (column var) and its variable label (column lbl).

factor_vars

A named list of the labelled variables that were converted to factors, giving the levels and labels of each.
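To make the dictionary component more concrete, here is a rough sketch of how a variable-name and variable-label table could be assembled from the label attributes that haven attaches to each column. It is an assumption about the general approach, not the function's source: the toy data frame df is made up, a base data frame stands in for the tibble, and only the column names var and lbl are taken from the Examples on this page.

```r
# Hypothetical data frame with "label" attributes, as haven::read_sav() would attach
df <- data.frame(id = 1:3)
df$wage <- c(10.5, 12, 9.75)
attr(df$id, "label") <- "Respondent ID"
attr(df$wage, "label") <- "Hourly wage"

# Pull the variable label from one column, or NA if the column has none
get_label <- function(x) {
  lbl <- attr(x, "label", exact = TRUE)
  if (is.null(lbl)) NA_character_ else as.character(lbl)
}

# One row per variable: its name and its label
dictionary <- data.frame(
  var = names(df),
  lbl = unname(vapply(df, get_label, character(1))),
  stringsAsFactors = FALSE
)

print(dictionary)
```

A table of this shape is what the Examples rely on when they turn the lbl column into a named list and pass it to labelled::var_label().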

Examples

## Not run: 
# The raw data (Wages.sav) can be downloaded from the link below and saved to disk
# https://www.cambridge.org/us/academic/subjects/psychology/psychology-research-methods-and-statistics/statistics-using-ibm-spss-integrative-approach-3rd-edition?format=PB&isbn=9781107461222
# Adjust the path to wherever Wages.sav was saved
raw_data <- haven::read_spss(file = "/Users/latour/Dropbox/code/R/qualtrics/Wages.sav")
cleaned <- clean_spss_data(raw_data, method = "forcats")
str(cleaned$data)

# Reapply the variable labels from the dictionary to the cleaned data
labels_list <- stats::setNames(object = as.list(cleaned$dictionary$lbl),
                               nm = cleaned$dictionary$var)

labelled::var_label(cleaned$data) <- labels_list

# The variable labels are now stored as attributes on each column.
str(cleaned$data)



## End(Not run)


emilelatour/lamisc documentation built on July 4, 2025, 6:33 p.m.