SurveyData: SurveyData objects

SurveyData R Documentation

SurveyData objects

Description

An R6 SurveyData object represents a survey and its metadata. The survey data are stored as a data frame. The object also stores the text of the survey questions and their allowed responses; if these are left empty, the column names and factor levels of the data are used instead. To enable weighted comparisons, survey weights and a survey design can be specified, with the design given in the notation of the survey package.

Methods

Public methods


Method new()

Create a new SurveyData object using an existing data frame and other survey information. This method is used to create the objects for both the sample and the population data. If a population is approximated from a large survey (like the ACS or DHS), then the package will enable the creation of a weighted poststratification matrix. If the population is already summarized as a poststratification matrix, set the weights to the size of each cell, $N_j$. If the complete individual-level population data are given, omit the weights; they will be set to 1 automatically.
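The relationship between weights and cell sizes can be sketched in base R. This is only an illustration with made-up values (the `toy_popn` data frame and its numbers are hypothetical), not the package's internal code: the cell size $N_j$ is the sum of the weights in each cell, and with weights of 1 that sum reduces to a head count.

```r
# Toy individual-level data standing in for a large weighted survey
# (hypothetical values, not the package's example data).
toy_popn <- data.frame(
  age_group = factor(c("18-35", "18-35", "36-65", "36-65", "36-65")),
  gender    = factor(c("female", "male", "female", "female", "male")),
  wt        = c(120, 80, 150, 50, 100)
)

# Cell sizes N_j: sum of the weights within each age_group x gender cell.
ps_table <- aggregate(wt ~ age_group + gender, data = toy_popn, FUN = sum)
names(ps_table)[names(ps_table) == "wt"] <- "N_j"
ps_table

# With a weight of 1 for every row, the same sum is a head count per cell.
toy_popn$one <- 1
cell_counts <- aggregate(one ~ age_group + gender, data = toy_popn, FUN = sum)
cell_counts
```

A table like `ps_table` corresponds to the "already summarized" case, where the `N_j` column would be named as the weights.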

Usage
SurveyData$new(
  data,
  questions = list(),
  responses = list(),
  weights = numeric(),
  design = list(ids = ~1)
)
Arguments
data

A data frame containing the survey data.

questions, responses

Named lists containing the text of the survey questions and the allowed responses, respectively. The names must correspond to the names of variables in data. If they are not provided, they are created internally from all factor, character, and binary variables in data. Responses can also be provided as a data frame with column names "data" and "asked", which hold the responses as coded in the data ("data") and the corresponding responses as worded in the survey ("asked"). See Examples.
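As a rough illustration of the "data"/"asked" layout, the base R sketch below builds such a lookup table and uses it to translate coded values into their full wording. The codes and the `match()` step are assumptions made for the example; how the package itself uses the table is not stated here.

```r
# Hypothetical coded responses as they might appear in a data column.
coded <- factor(c("BP1", "CP", "BP2", "BP1"), levels = c("BP1", "BP2", "CP"))

# Lookup table with the two columns described above:
# "data"  = level as coded in the data,
# "asked" = wording shown to respondents.
vote_responses <- data.frame(
  data  = levels(coded),
  asked = c("Box Party Faction A", "Box Party Faction B", "Circle Party"),
  stringsAsFactors = FALSE
)

# Translate each coded value to its full wording by matching on "data".
asked_text <- vote_responses$asked[match(as.character(coded), vote_responses$data)]
asked_text
```

The table needs one row per level, so building the "data" column from `levels()` keeps it in step with the factor.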

weights

Optionally, the name of a variable in data containing survey weights.

design

Optionally, a named list of arguments (except weights and data) to pass to survey::svydesign() to specify the survey design.
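One way to picture how a named list of design arguments could be combined with the weights and data is a `do.call()` on `survey::svydesign()`. The sketch below is an assumption about the general pattern, not the package's actual implementation; `toy_sample` and its values are hypothetical, and the code only runs if the survey package is installed.

```r
if (requireNamespace("survey", quietly = TRUE)) {
  # Hypothetical sample with an outcome and a weight column.
  toy_sample <- data.frame(
    y  = c(1, 0, 1, 1),
    wt = c(2, 1, 3, 2)
  )

  # Design arguments in the same form as the `design` argument above:
  # everything except weights and data.
  design_args <- list(ids = ~1)

  # Add weights and data, then pass the full argument list to svydesign().
  des <- do.call(
    survey::svydesign,
    c(design_args, list(weights = ~wt, data = toy_sample))
  )

  # A weighted estimate computed from the resulting design object.
  weighted_mean_y <- survey::svymean(~y, des)
  weighted_mean_y
}
```

Keeping weights and data out of the list means further `svydesign()` arguments, such as `strata`, can be added to it without touching the rest of the call.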

Returns

A SurveyData object that can be used in the creation of a SurveyMap object.

Examples
# Example sample data
head(shape_survey)

# SurveyData object for sample data
box_prefs <- SurveyData$new(
  data = shape_survey,
  questions = list(
    age = "Please identify your age group",
    gender = "Please select your gender",
    vote_for = "Which party did you vote for in the 2018 election?",
    y = "If today is the election day, would you vote for the Box Party?"
  ),
  responses = list(
    age = levels(shape_survey$age),
    gender = levels(shape_survey$gender),
    # Here we use a data frame for the responses because the levels
    # in the data are abridged versions of the actual responses.
    # This can be useful when surveys have brief or non-descriptive responses.
    vote_for = data.frame(
      data = levels(shape_survey$vote_for),
      asked = c("Box Party Faction A", "Box Party Faction B",
                "Circle Party Coalition", "Circle Party")
    ),
    y = c("no", "yes")
  ),
  weights = "wt",
  design = list(ids = ~1)
)
box_prefs$print()
box_prefs$n_questions()


# Example population data
head(approx_voters_popn)

# SurveyData object for population if estimated from large survey
popn_obj1 <- SurveyData$new(
  data = approx_voters_popn,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?"
  ),
  # order doesn't matter (gender before age here) because
  # the list has the names of the variables
  responses = list(
    gender = levels(approx_voters_popn$gender),
    age_group = levels(approx_voters_popn$age_group)
  ),
  weights = "wt" # use the wt column from approx_voters_popn data
)

# SurveyData object for population if poststratification matrix already known
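library(dplyr)
# Sum the weights within each age_group x gender cell to get cell sizes N_j;
# .groups = "drop" returns an ungrouped data frame
popn_ps <- approx_voters_popn %>%
  group_by(age_group, gender) %>%
  summarise(N_j = sum(wt), .groups = "drop")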

popn_obj2 <- SurveyData$new(
  data = popn_ps,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?"
  ),
  responses = list(
    gender = levels(popn_ps$gender),
    age_group = levels(popn_ps$age_group)
  ),
  weights = "N_j" # use the N_j column from popn_ps data
)

# SurveyData object for population if individual population data known:
# (pretend that approx_voters_popn is the full population)
popn_obj3 <- SurveyData$new(
  data = approx_voters_popn,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?"
  ),
  responses = list(
    gender = levels(approx_voters_popn$gender),
    age_group = levels(approx_voters_popn$age_group)
  )
)
popn_obj1
popn_obj2
popn_obj3


Method n_obs()

Return the number of observations in the survey data.

Usage
SurveyData$n_obs()
Returns

An integer.


Method n_questions()

Return the number of survey questions.

Usage
SurveyData$n_questions()
Returns

An integer.


Method print()

Print a summary of the survey data.

Usage
SurveyData$print(...)
Arguments
...

Currently ignored.

Returns

The SurveyData object, invisibly.


Method add_survey_data_column()

Add a column to the sample data. This is primarily intended for internal use but may occasionally be useful.

Usage
SurveyData$add_survey_data_column(name, value)
Arguments
name, value

The name of the new variable (a string) and the vector of values to add to the data frame.

Returns

The SurveyData object, invisibly.


Method survey_data()

Access the data frame containing the sample data.

Usage
SurveyData$survey_data(key = TRUE)
Arguments
key

Should the .key column be included? This column just indicates the original order of the rows and is primarily intended for internal use.

Returns

A data frame.
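The purpose of a row-order column can be sketched in base R. How the package actually builds `.key` is not stated here, so the `seq_len()` construction and the `toy` data frame below are assumptions made purely for illustration.

```r
# Hypothetical data frame; values are made up.
toy <- data.frame(id = c("a", "b", "c"), score = c(3, 1, 2))

# Assumption: record the original row order as 1, 2, ..., n.
toy$.key <- seq_len(nrow(toy))

# Any operation that reorders rows (here, sorting by score) ...
shuffled <- toy[order(toy$score), ]

# ... can be undone by sorting on the key.
restored <- shuffled[order(shuffled$.key), ]

# Dropping the key mirrors what key = FALSE is described as doing.
restored_no_key <- restored[, setdiff(names(restored), ".key"), drop = FALSE]
restored_no_key
```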


Method questions()

Access the list of survey questions.

Usage
SurveyData$questions()
Returns

A named list.


Method responses()

Access the list of allowed survey responses.

Usage
SurveyData$responses()
Returns

A named list.


Method weights()

Access the survey weights.

Usage
SurveyData$weights()
Returns

A numeric vector.


Method design()

Access the survey design.

Usage
SurveyData$design()

Method clone()

The objects of this class are cloneable with this method.

Usage
SurveyData$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
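Because R6 objects have reference semantics, plain assignment does not copy them, which is why a clone method is useful. The sketch below uses a hypothetical toy class (`Counter`, not part of this package) and runs only if the R6 package is installed.

```r
if (requireNamespace("R6", quietly = TRUE)) {
  # Hypothetical toy class used only to illustrate cloning.
  Counter <- R6::R6Class("Counter", public = list(
    n = 0,
    add = function() {
      self$n <- self$n + 1
      invisible(self)
    }
  ))

  original <- Counter$new()
  alias    <- original          # same object, second name
  copy     <- original$clone()  # independent copy

  original$add()

  alias$n  # changed along with original
  copy$n   # unaffected by the change
}
```

A deep clone matters when a field is itself an R6 object: with `deep = FALSE` the copy would still share that inner object with the original.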


lauken13/mrpkit documentation built on Aug. 6, 2023, 3:42 a.m.