subset_surveys: Subset surveys

subset_surveys    R Documentation

Subset surveys

Description

This is a wrapper function for various procedures to reduce the size of surveys by removing variables that are not harmonized.

Usage

subset_surveys(
  survey_list,
  survey_paths = NULL,
  rowid = "rowid",
  subset_name = "subset",
  subset_vars = NULL,
  crosswalk_table = NULL,
  import_path = NULL,
  export_path = NULL
)

subset_waves(waves, subset_vars = NULL)

subset_save_surveys(
  crosswalk_table,
  subset_name = "subset",
  survey_list = NULL,
  subset_vars = NULL,
  survey_paths = NULL,
  import_path = NULL,
  export_path = NULL
)

Arguments

survey_list

A list of surveys imported with read_surveys. If set to NULL, survey_paths should give the full paths to the surveys.

survey_paths

A vector of full file paths to the surveys to subset.

rowid

The unique row (observation) identifier in the files. Defaults to "rowid", which is the default of the importing functions in this package.

subset_name

An identifier for the survey subset.

subset_vars

The names of the variables to keep from every survey in the list. Defaults to NULL, in which case all variables are returned without subsetting.

crosswalk_table

A crosswalk table created by crosswalk_table_create, or a manually created crosswalk table that includes at least filename, var_name_orig and var_name_target, and optionally var_label_orig and var_label_target. This parameter is optional and defaults to NULL.

waves

A list of surveys imported with read_surveys.

Details

This function supports several workflows. Subsetting can be based on a vector of variable names given by subset_vars, or on a crosswalk table. subset_save_surveys can also be called directly.
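The idea of subsetting by variable names can be illustrated with a minimal base R sketch. It does not use the package and is not its implementation: the toy data frames, the helper keep_vars and the choice to silently skip names that a survey does not contain are all assumptions made for illustration.

```r
# Two toy "surveys" as plain data frames (hypothetical data, not from the package)
toy_surveys <- list(
  wave_1 = data.frame(
    rowid = c("w1_1", "w1_2"),
    isocntry = c("NL", "PL"),
    qa10_1 = c(1, 2),
    extra_a = c(TRUE, FALSE)
  ),
  wave_2 = data.frame(
    rowid = c("w2_1", "w2_2"),
    isocntry = c("HU", "SK"),
    extra_b = c(3, 4)
  )
)

# Names of the variables to keep, in the spirit of subset_vars
toy_subset_vars <- c("rowid", "isocntry", "qa10_1")

# Hypothetical helper: keep only the requested variables that the survey has.
# Skipping absent names is an assumption of this sketch.
keep_vars <- function(survey, vars) {
  survey[, intersect(vars, names(survey)), drop = FALSE]
}

toy_subset <- lapply(toy_surveys, keep_vars, vars = toy_subset_vars)
lapply(toy_subset, names)
```

Each element of toy_subset keeps the row identifier and only those requested variables that exist in the corresponding survey.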

subset_surveys will also harmonize the variable names if var_name_target is defined in the crosswalk_table input. harmonize_survey_variables is a wrapper that requires the new (target) variable names to be present in a valid crosswalk table.
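To make the role of the crosswalk table more concrete, here is a base R sketch of a manually created table with the three required columns (filename, var_name_orig, var_name_target) and of how such a table can drive both subsetting and renaming. The file names, variable names and the helper subset_and_rename are hypothetical, and the code only mimics the idea; it is not how the package does it internally.

```r
# A manually created crosswalk table with the minimum required columns.
# All values are made up for illustration.
toy_crosswalk <- data.frame(
  filename = c("wave_1.rds", "wave_1.rds", "wave_2.rds", "wave_2.rds"),
  var_name_orig = c("rowid", "qa10_1", "rowid", "qb5"),
  var_name_target = c("rowid", "trust", "rowid", "trust"),
  stringsAsFactors = FALSE
)

# Toy surveys, named by the file they would have been read from
toy_surveys <- list(
  "wave_1.rds" = data.frame(rowid = c("a", "b"), qa10_1 = c(1, 2), unused = c(9, 9)),
  "wave_2.rds" = data.frame(rowid = c("c", "d"), qb5 = c(2, 1), other = c(0, 0))
)

# Hypothetical helper: keep the variables listed for one file,
# then rename them to their target names.
subset_and_rename <- function(file, surveys, crosswalk) {
  rows <- crosswalk[crosswalk$filename == file, ]
  out <- surveys[[file]][, rows$var_name_orig, drop = FALSE]
  names(out) <- rows$var_name_target
  out
}

harmonized <- lapply(
  names(toy_surveys), subset_and_rename,
  surveys = toy_surveys, crosswalk = toy_crosswalk
)
names(harmonized) <- names(toy_surveys)
lapply(harmonized, names)
```

After this step both toy surveys share the same variable names, which is the precondition for joining them into one data frame.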

Value

Returns a list of surveys, or saves individual rds files to export_path.

Examples

examples_dir <- system.file("examples", package = "retroharmonize")
survey_list <- dir(examples_dir)[grepl("\\.rds", dir(examples_dir))]

example_surveys <- read_surveys(
  file.path( examples_dir, survey_list)
  )
  
subset_surveys(survey_list = example_surveys, 
               subset_vars = c("rowid", "isocntry", "qa10_1", "qa14_1"), 
               subset_name = "subset_example")

antaldaniel/retroharmonize documentation built on Dec. 31, 2024, 9:52 p.m.