SurveyData | R Documentation |
An R6 SurveyData object represents a survey and its metadata. The survey
itself is a data frame containing all data from the survey. The SurveyData
object also includes the survey questions and responses (if left empty,
these default to the column names and factor level names). To enable
weighted comparisons, survey weights and a survey design can be specified.
The survey design is specified using survey package notation.
new()
Create a new SurveyData object using an existing data frame and other
survey information. This method is used to create the objects for both the
sample and the population data. If a population is approximated from a
large survey (like the ACS or DHS), then the package will enable the
creation of a weighted poststratification matrix. If the population is
already summarized as a poststratification matrix, then set the weights to
the size of each cell, $N_j$. If the entire individual-level population
data is given, then weights should be omitted and will be automatically
set to 1.
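The three population cases above differ only in what the weights represent. The following is a minimal base R sketch of that distinction, not the package's own code; the toy data frame `popn` and its values are hypothetical.

```r
# Toy individual-level data standing in for a large weighted survey
# (hypothetical values, not taken from the package).
popn <- data.frame(
  age_group = factor(c("18-35", "18-35", "36-65", "36-65", "36-65")),
  gender    = factor(c("female", "male", "female", "male", "male")),
  wt        = c(1.5, 2.0, 0.5, 1.0, 3.0)
)

# Case 1: population approximated from a large survey.
# The survey weights are used as they are.
survey_weights <- popn$wt

# Case 2: population already summarized as a poststratification table.
# One row per cell, with the cell size N_j as the weight.
ps_table <- aggregate(wt ~ age_group + gender, data = popn, FUN = sum)
names(ps_table)[names(ps_table) == "wt"] <- "N_j"

# Case 3: full individual-level population data.
# Every row counts once, so each weight is 1.
unit_weights <- rep(1, nrow(popn))

print(ps_table)
```

Collapsing to cells preserves the total weight, so the sum of `N_j` equals the sum of the original weights.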
SurveyData$new( data, questions = list(), responses = list(), weights = numeric(), design = list(ids = ~1) )
data
A data frame containing the survey data.
questions, responses
Named lists containing the text of the survey questions and the allowed
responses, respectively. The names must correspond to the names of
variables in data. If these aren't provided then they will be created
internally using all factor, character, and binary variables in data.
Responses can also be provided as a data frame with column names "data"
and "asked", which reflect the responses as coded in the data ("data") and
their corresponding actual survey responses ("asked"). See Examples.
weights
Optionally, the name of a variable in data containing survey weights.
design
Optionally, a named list of arguments (except weights and data) to pass to
survey::svydesign() to specify the survey design.
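As a rough illustration of how a design list could reach survey::svydesign(), the sketch below appends weights and data to the list and passes it on with do.call(). This assembly step is an assumption about the mechanics, not the package's confirmed internals. The data frame `survey_df` is hypothetical, and the call is skipped if the survey package is not installed.

```r
# A design list in survey package notation, as it would be passed to new().
design <- list(ids = ~1)

# Hypothetical sample data with a weight column.
survey_df <- data.frame(y = c(0, 1, 1, 0), wt = c(1.2, 0.8, 1.0, 1.0))

# Assumed assembly step: the list supplies everything except weights and data,
# which are appended before the call.
design_args <- c(design, list(weights = ~wt, data = survey_df))

if (requireNamespace("survey", quietly = TRUE)) {
  svy <- do.call(survey::svydesign, design_args)
  print(class(svy))
}
```

Keeping weights and data out of the user-supplied list avoids specifying them twice.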
A SurveyData object that can be used in the creation of a SurveyMap object.
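The data frame form of the responses argument pairs each coded level ("data") with the wording shown to respondents ("asked"). Here is a small base R sketch of that correspondence. The level codes are hypothetical, and the lookup only illustrates the pairing, not how the package uses it internally.

```r
# Hypothetical abridged codes as they might appear in the data.
coded <- factor(
  c("BP1", "CP2", "BP1", "BP2"),
  levels = c("BP1", "BP2", "CP1", "CP2")
)

# The "data"/"asked" pairing, in the shape accepted for responses.
vote_for_key <- data.frame(
  data  = levels(coded),
  asked = c("Box Party Faction A", "Box Party Faction B",
            "Circle Party Coalition", "Circle Party"),
  stringsAsFactors = FALSE
)

# Look up the full response text for each coded value.
asked_text <- vote_for_key$asked[match(as.character(coded), vote_for_key$data)]
print(asked_text)
```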
n_obs()
Number of observations in the survey data
SurveyData$n_obs()
An integer.
n_questions()
Number of survey questions
SurveyData$n_questions()
An integer.
print()
Print a summary of the survey data
SurveyData$print(...)
...
Currently ignored.
The SurveyData object, invisibly.
add_survey_data_column()
Add a column to the sample data. This is primarily intended for internal use but may occasionally be useful.
SurveyData$add_survey_data_column(name, value)
name, value
The name of the new variable (a string) and the vector of values to add to the data frame.
The SurveyData object, invisibly.
survey_data()
Access the data frame containing the sample data.
SurveyData$survey_data(key = TRUE)
key
Should the .key column be included? This column just indicates the
original order of the rows and is primarily intended for internal use.
A data frame.
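To show what a row-order column like .key is for, the base R sketch below records the original order, reorders the rows, and restores them. Treating .key as a plain integer index is an assumption, since the text above only says it indicates the original row order. The data frame is hypothetical.

```r
# Hypothetical sample data.
survey_df <- data.frame(age = c("18-35", "36-65", "66+"), y = c(1, 0, 1))

# Record the original row order (assumed to be a simple integer index).
survey_df$.key <- seq_len(nrow(survey_df))

# Rows get reordered during some later processing step.
shuffled <- survey_df[c(3, 1, 2), ]

# The key restores the original order.
restored <- shuffled[order(shuffled$.key), ]

# Dropping the key gives a data frame without the bookkeeping column,
# analogous to calling survey_data(key = FALSE).
without_key <- restored[, setdiff(names(restored), ".key"), drop = FALSE]
print(without_key)
```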
questions()
Access the list of survey questions
SurveyData$questions()
A named list.
responses()
Access the list of allowed survey responses
SurveyData$responses()
A named list.
weights()
Access the survey weights
SurveyData$weights()
A numeric vector.
design()
Access the survey design
SurveyData$design()
clone()
The objects of this class are cloneable with this method.
SurveyData$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `SurveyData$new`
## ------------------------------------------------
# Example sample data
head(shape_survey)
# SurveyData object for sample data
box_prefs <- SurveyData$new(
  data = shape_survey,
  questions = list(
    age = "Please identify your age group",
    gender = "Please select your gender",
    vote_for = "Which party did you vote for in the 2018 election?",
    y = "If today is the election day, would you vote for the Box Party?"
  ),
  responses = list(
    age = levels(shape_survey$age),
    gender = levels(shape_survey$gender),
    # Here we use a data frame for the responses because the levels
    # in the data are abridged versions of the actual responses.
    # This can be useful when surveys have brief/non descriptive responses.
    vote_for = data.frame(
      data = levels(shape_survey$vote_for),
      asked = c("Box Party Faction A", "Box Party Faction B",
                "Circle Party Coalition", "Circle Party")
    ),
    y = c("no", "yes")
  ),
  weights = "wt",
  design = list(ids = ~1)
)
box_prefs$print()
box_prefs$n_questions()
# Example population data
head(approx_voters_popn)
# SurveyData object for population if estimated from large survey
popn_obj1 <- SurveyData$new(
  data = approx_voters_popn,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?"
  ),
  # order doesn't matter (gender before age here) because
  # the list has the names of the variables
  responses = list(
    gender = levels(approx_voters_popn$gender),
    age_group = levels(approx_voters_popn$age_group)
  ),
  weights = "wt" # use the wt column from approx_voters_popn data
)
# SurveyData object for population if poststratification matrix already known
library(dplyr)
popn_ps <- approx_voters_popn %>%
  group_by(age_group, gender) %>%
  summarise(N_j = sum(wt))
popn_obj2 <- SurveyData$new(
  data = popn_ps,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?"
  ),
  responses = list(
    gender = levels(popn_ps$gender),
    age_group = levels(popn_ps$age_group)
  ),
  weights = "N_j" # use N_j column from popn_ps data
)
# SurveyData object for population if individual population data known:
# (pretend that approx_voters_popn is the full population)
popn_obj3 <- SurveyData$new(
  data = approx_voters_popn,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?"
  ),
  responses = list(
    gender = levels(approx_voters_popn$gender),
    age_group = levels(approx_voters_popn$age_group)
  )
)
popn_obj1
popn_obj2
popn_obj3