merge_samples: Merge Sampled Data based on IDs

merge_samples    R Documentation

Merge Sampled Data based on IDs

Description

In an object of class "Multiwave", merge_samples creates a dataframe in the "data" slot of the specified wave by merging the dataframe in the "sampled_data" slot with the dataframe in the "data" slot of the previous wave.

Usage

merge_samples(
  x,
  phase,
  wave,
  id = NULL,
  phase_sample_ind = "sampled_phase",
  wave_sample_ind = "sampled_wave",
  include_probs = NULL
)

Arguments

x

An object of class "Multiwave".

phase

A numeric value specifying the phase of the Multiwave object that the specified wave is in. Cannot be phase 1.

wave

A numeric value specifying the wave of the Multiwave object that the merge should be performed in. This wave must have a valid dataframe in the "sampled_data" slot. The previous wave, taken as the final wave of the previous phase if wave = 1, must have a valid dataframe in the "data" slot.

id

A character value specifying the name of the column holding unit ids. Taken from wave, phase, or overall metadata (searched for in that order) if NULL. Defaults to NULL.

phase_sample_ind

A character value specifying the name of the column that should hold the indicator of whether each unit has already been sampled in the current phase. The specified phase number will be appended to the end of the given character name. Defaults to "sampled_phase".

wave_sample_ind

A character value specifying the name of the column that should hold the indicator of whether each unit has already been sampled in the current wave. The specified phase and wave numbers, separated by ".", will be appended to the end of the given character name. If FALSE, no such column is created. Defaults to "sampled_wave".

include_probs

A logical value. If TRUE, looks for "probs" in the design_data slot and includes the corresponding sampling probability for each element sampled in the current wave in the merged data in a column named "sampling_prob". If this column already exists, it keeps the existing column and adds (or replaces) the values for units sampled in the current wave. Returns an error if specified but wave_sample_ind is FALSE. Defaults to NULL, which looks for a "probs" argument in the metadata and does not create (or add to an existing) "sampling_prob" column if none is found.
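The way include_probs updates the "sampling_prob" column can be pictured with a small base-R sketch. This is only an illustration of the described behaviour, not the package's implementation: the data frame dat and the vectors wave_ids and wave_probs are hypothetical, and the probabilities are hard-coded here rather than read from a design_data slot.

```r
# Illustrative sketch only; names and values are hypothetical.
# Merged data that already carries probabilities from an earlier wave
dat <- data.frame(id = 1:5, sampling_prob = c(0.5, NA, NA, 0.5, NA))

# Units sampled in the current wave and their sampling probabilities
wave_ids   <- c(2, 3)
wave_probs <- c(0.25, 0.4)

# Create the column if it does not exist yet, otherwise keep it
if (!"sampling_prob" %in% names(dat)) {
  dat$sampling_prob <- NA_real_
}

# Add (or replace) values only for units sampled in the current wave
dat$sampling_prob[match(wave_ids, dat$id)] <- wave_probs
```

Units 1 and 4 keep the probabilities they already had, units 2 and 3 receive the current-wave values, and unit 5 stays NA.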

Details

Columns in "sampled_data" whose names do not match any column name in the "data" slot of the previous wave will be added as new columns in the output dataframe. All ids that do not appear in "sampled_data" will receive NA values for these new variables.

If a column name in the "sampled_data" matches a column name in the "data" slot of the previous wave, these columns will be merged into one column with the same name in the output dataframe. For ids that have non-missing values in both columns of the merge, the value from "sampled_data" will overwrite the previous value and a warning will be printed. All ids present in the "data" from the previous wave but missing from "sampled_data" will be given NA values for the newly merged variables.
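The merge rules above can be sketched in base R. This is a simplified illustration under stated assumptions, not the code merge_samples actually runs: previous and sampled are hypothetical stand-ins for the previous wave's "data" and the current wave's "sampled_data", every sampled id is assumed to exist in previous, and for a shared column the sketch leaves rows of unsampled ids untouched (in this toy data those rows are NA anyway).

```r
# Illustrative sketch only; object names and values are hypothetical.
previous <- data.frame(id = 1:5, age = c(NA, 41, NA, NA, NA))
sampled  <- data.frame(id = c(2, 3),
                       age = c(42, 37),           # name shared with `previous`
                       biomarker = c(1.8, 2.4))   # new column

merged <- previous
rows <- match(sampled$id, merged$id)  # row positions of the sampled ids

for (col in setdiff(names(sampled), "id")) {
  new_vals <- sampled[[col]]
  fill <- !is.na(new_vals)
  if (!col %in% names(merged)) {
    # New column: ids that were not sampled receive NA
    merged[[col]] <- NA
  } else {
    # Shared column: warn when a non-missing value is about to be replaced
    clash <- fill & !is.na(merged[[col]][rows])
    if (any(clash)) {
      warning("Overwriting ", sum(clash), " existing value(s) in '", col, "'")
    }
  }
  merged[[col]][rows[fill]] <- new_vals[fill]
}
```

Here id 2 already had an age, so its value is replaced and a warning is raised, while id 3 simply has its missing age filled in. The new biomarker column is NA for ids 1, 4, and 5.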

If columns with the name produced by phase_sample_ind or wave_sample_ind already exist, they will be overwritten.
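As a small illustration of the column names involved, the sketch below builds the names the way the argument descriptions state (phase number appended for the phase indicator, phase and wave separated by "." for the wave indicator) and overwrites a pre-existing column. The 0/1 coding of the indicator and the objects dat and sampled_ids are assumptions made for the example.

```r
# Illustrative sketch only; `dat`, `sampled_ids`, and the 0/1 coding are assumed.
phase <- 2
wave  <- 1

phase_col <- paste0("sampled_phase", phase)            # "sampled_phase2"
wave_col  <- paste0("sampled_wave", phase, ".", wave)  # "sampled_wave2.1"

# A data frame that already contains a column named like the wave indicator
dat <- data.frame(id = 1:5, stale = 9)
names(dat)[2] <- wave_col

sampled_ids <- c(2, 3)

# Assigning under the same name replaces the existing column
dat[[wave_col]] <- as.integer(dat$id %in% sampled_ids)
```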

Value

A Multiwave object with the merged dataframe in the "data" slot of the specified wave.

Examples

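library(datasets)
library(optimall)

# Add an id column so that units can be matched across waves
iris <- data.frame(iris, id = 1:150)

MySurvey <- multiwave(phases = 2, waves = c(1, 3))

# Phase 1 data: every column except Sepal.Width
set_mw(MySurvey, phase = 1, slot = "data") <-
  data.frame(dplyr::select(iris, -Sepal.Width))

# Phase 2, wave 1: Sepal.Width collected for the 40 sampled units
set_mw(MySurvey, phase = 2, wave = 1, slot = "sampled_data") <-
  dplyr::select(iris, id, Sepal.Width)[1:40, ]
set_mw(MySurvey, phase = 2, wave = 1, slot = "samples") <-
  list(ids = 1:40)

# Merge the sampled data into the "data" slot of phase 2, wave 1
MySurvey <- merge_samples(MySurvey, phase = 2, wave = 1, id = "id")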

optimall documentation built on June 22, 2024, 9:34 a.m.