The goal of retroharmonize is to facilitate retrospective (ex-post) harmonization of data, particularly survey data, in a reproducible manner. The package provides tools for organizing the metadata, standardizing the coding of variables, variable names and value labels, including missing values, and for documenting all transformations, with the help of comprehensive S3 classes.

import functions

Read data stored in formats with rich metadata, such as SPSS (.sav) files, and make them usable in a programmatic context.
read_spss: read an SPSS file and record metadata for reproducibility
read_rds: read an rds file and record metadata for reproducibility
read_surveys: programmatically read a list of surveys
pull_survey: pull a single survey from a survey list.

subsetting functions

subset_surveys: remove variables from surveys that cannot be harmonized.

variable name harmonization functions

harmonize_survey_variables: Create a list of surveys with harmonized variable names.

codebook functions

codebook_create: Not yet functional.

codelist functions

codelist_create: Not yet functional.

variable label harmonization functions

Create consistent coding and labelling.
harmonize_values: Harmonize the label list across surveys.
harmonize_survey_values: Create a list of surveys with harmonized value labels.
na_range_to_values: Make the na_range attributes, as imported from SPSS, consistent with the na_values attributes.
label_normalize: Remove special characters, whitespace, and other typical typing errors to help make labels and variable names uniform.
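
To illustrate the idea behind label normalization and value label harmonization, here is a minimal base R sketch. It is not the package's implementation: normalize_label_sketch, the toy labels, and the target codes are all hypothetical and chosen only for this example.

```r
# Hypothetical helper, NOT the package's label_normalize() implementation.
normalize_label_sketch <- function(x) {
  x <- tolower(x)
  # Replace every run of non-alphanumeric characters (punctuation,
  # repeated whitespace, hyphens) with a single space.
  x <- gsub("[^a-z0-9]+", " ", x)
  trimws(x)
}

# Toy labels as they might appear in different survey files (assumed data).
raw_labels <- c("  Tend to trust", "TEND TO TRUST!", "tend  to-trust",
                "Do not know", "DO NOT KNOW ")

normalized <- normalize_label_sketch(raw_labels)
unique(normalized)

# Hypothetical harmonized coding: map each normalized label to one code.
harmonized_codes <- c("tend to trust" = 1, "do not know" = 99997)
unname(harmonized_codes[normalized])
```

After normalization the five spellings collapse into two distinct labels, so a single lookup table is enough to assign the same code across surveys.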

survey harmonization functions

merge_surveys: Create a list of surveys with harmonized names and variable labels.
crosswalk_surveys: Create a list of surveys with harmonized variable names, harmonized value labels and harmonized R classes.
crosswalk: Create a joined data frame of surveys with harmonized variable names, harmonized value labels and harmonized R classes.
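
The crosswalk idea can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy surveys, the column names of crosswalk_sketch, and the helper rename_with_crosswalk are hypothetical and do not reproduce the package's crosswalk table format or functions.

```r
# Two toy surveys asking the same question under different variable names.
wave_1 <- data.frame(rowid = c("w1_1", "w1_2"), qa14_1 = c(1, 2),
                     stringsAsFactors = FALSE)
wave_2 <- data.frame(rowid = c("w2_1", "w2_2"), qd6.1 = c(2, 1),
                     stringsAsFactors = FALSE)

# Hypothetical crosswalk table: one row per survey and original variable.
crosswalk_sketch <- data.frame(
  id              = c("wave_1", "wave_1", "wave_2", "wave_2"),
  var_name_orig   = c("rowid", "qa14_1", "rowid", "qd6.1"),
  var_name_target = c("rowid", "trust_parliament", "rowid", "trust_parliament"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the mapped variables and rename them.
rename_with_crosswalk <- function(survey, survey_id, crosswalk) {
  rules <- crosswalk[crosswalk$id == survey_id, ]
  survey <- survey[, rules$var_name_orig, drop = FALSE]
  names(survey) <- rules$var_name_target
  survey$survey_id <- survey_id
  survey
}

# With identical column names, the surveys can be stacked into one data frame.
joined <- rbind(
  rename_with_crosswalk(wave_1, "wave_1", crosswalk_sketch),
  rename_with_crosswalk(wave_2, "wave_2", crosswalk_sketch)
)
joined
```

Keeping the mapping in a table rather than in code means the renaming rules can be reviewed and edited separately, which is what makes the process reproducible.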

metadata functions

metadata_create: Create a metadata data frame from one or more surveys.
metadata_survey_create: Create a joined metadata data frame from one survey.
create_codebook and codebook_waves_create
crosswalk_table_create: Create an initial crosswalk table from a metadata data frame.

documentation functions

Make the workflow reproducible by recording the harmonization process.
document_survey_item: Return a list of the current and historic coding and labelling of the valid range and of the missing values or range, the history of the variable names, and the history of the survey IDs.
document_surveys: Document the key attributes of the surveys in a survey list.

type conversion functions

Consistently treat labels and SPSS-style user-defined missing values in the R language. survey helps construct a valid survey data frame, and labelled_spss_survey helps create a vector for a questionnaire item.
as_numeric: convert to numeric values.
as_factor: convert labels to factor levels.
as_character: convert labels to characters.
as_labelled_spss_survey: convert labelled and labelled_spss vectors to labelled_spss_survey vectors.
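
Why consistent treatment of user-defined missing values matters can be shown with a small base R sketch. The helpers to_numeric_sketch and to_factor_sketch and the toy codes are hypothetical. They only illustrate one plausible convention (missing codes become NA in numeric form but stay visible as factor levels) and are not the package's own methods.

```r
# Toy item: 1 and 0 are valid answers; 99997 and 99998 are
# SPSS-style user-defined missing codes (assumed for this example).
item_values   <- c(1, 0, 99997, 1, 99998)
item_labels   <- c("trust" = 1, "not trust" = 0,
                   "declined" = 99997, "inapplicable" = 99998)
item_na_codes <- c(99997, 99998)

# Hypothetical helper: numeric conversion turns missing codes into NA.
to_numeric_sketch <- function(x, na_codes) {
  x[x %in% na_codes] <- NA_real_
  x
}

# Hypothetical helper: factor conversion keeps missing categories as levels.
to_factor_sketch <- function(x, labels) {
  factor(x, levels = unname(labels), labels = names(labels))
}

# Averaging the raw codes is misleading because 99997 and 99998 are not answers.
naive_mean <- mean(item_values)

# After conversion, only the three valid answers enter the average.
valid_mean <- mean(to_numeric_sketch(item_values, item_na_codes), na.rm = TRUE)

# The factor form still shows how many cases fall into each missing category.
item_factor <- to_factor_sketch(item_values, item_labels)
table(item_factor)
```

The numeric form is suited to computing statistics, while the factor form preserves the distinction between the different reasons for missingness.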


