SurveyMap: SurveyMap

SurveyMap R Documentation

SurveyMap

Description

An R6 SurveyMap object holds the mapping between a set of items in a survey and a population dataset. For each item, the label is the item's label in each dataset and the values are a list of all possible values. The values for the survey and population must be aligned: the two lists must have the same number of elements, and the values at index i in each list must be equivalent. If the values have a meaningful ordering, list them in that order, either ascending or descending.
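
A minimal sketch of what "aligned" means here, using base R only. The vectors sample_values and population_values are hypothetical and are not part of the package. Position i in one corresponds to position i in the other, and both follow the ascending order of the age groups.

```r
# Hypothetical aligned value lists for an age item (illustration only)
sample_values     <- c("18-25", "26-35", "36-45", "46-55")
population_values <- c("18-35", "18-35", "36-55", "36-55")

# Aligned lists must have the same number of elements
stopifnot(length(sample_values) == length(population_values))

# The value at index i in one list is the counterpart of index i in the other
alignment <- data.frame(sample = sample_values, population = population_values)
print(alignment)
```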

Methods

Public methods


Method new()

Create a new SurveyMap object.

Usage
SurveyMap$new(sample, population, ...)
Arguments
sample

The SurveyData object corresponding to the sample data.

population

The SurveyData object corresponding to the population data.

...

QuestionMap objects.

Returns

A SurveyMap object.


Method print()

Print a summary of the mapping.

Usage
SurveyMap$print(...)
Arguments
...

Currently ignored.

Returns

The SurveyMap object, invisibly.


Method add()

Add new QuestionMaps.

Usage
SurveyMap$add(...)
Arguments
...

The QuestionMaps to add.

Returns

The SurveyMap object, invisibly.


Method delete()

Delete QuestionMaps.

Usage
SurveyMap$delete(...)
Arguments
...

The QuestionMaps to delete.

Returns

The SurveyMap object, invisibly.


Method replace()

Replace one QuestionMap with another.

Usage
SurveyMap$replace(old_question, new_question)
Arguments
old_question

The QuestionMap object to replace.

new_question

The QuestionMap object to use instead.

Returns

The SurveyMap object, invisibly.


Method validate()

Validate the mapping.

Usage
SurveyMap$validate()
Returns

The SurveyMap object, invisibly.


Method mapping()

The mapping method uses the given maps between questions to create new sample and population data frames that have unified variable names (e.g., if the underlying construct is called age, both sample and population will have an age column, even if the raw data used a different variable name in each).

This method also unifies the levels of each variable in the sample and population so that the maximum set of consistent levels is created. The new levels are named after the sample level names. If multiple levels are combined, the new name is the existing level names separated by a + (e.g., if age groups "18-25" and "26-30" are combined, the new name will be "18-25 + 26-30").

Use the mapped_sample_data and mapped_population_data methods to access the resulting data frames.

Usage
SurveyMap$mapping()
Returns

The SurveyMap object, invisibly.
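
The level-naming rule described for mapping() can be illustrated with a short base R sketch. This is not the package's implementation: it covers only the simple case where several sample levels map to one population level, and the names values_map, groups, unified_names, and sample_lookup are hypothetical.

```r
# Hypothetical many-to-one map: names are sample levels, values are population levels
values_map <- c(
  "18-25" = "18-35", "26-35" = "18-35",
  "36-45" = "36-55", "46-55" = "36-55",
  "56-65" = "56-65"
)

# Group the sample levels that share a population level, keeping the original order
groups <- split(names(values_map), factor(values_map, levels = unique(values_map)))

# Name each unified level after its sample levels, joined by " + "
unified_names <- vapply(groups, paste, character(1), collapse = " + ")

# Lookup from each original sample level to its unified level
sample_lookup <- setNames(unname(unified_names[values_map]), names(values_map))
print(sample_lookup)
```

A level that is not combined with any other (here "56-65") keeps its sample name unchanged.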


Method tabulate()

Prepare the poststratification table. The resulting data frame is available via the poststrat_data method. See Examples.

Usage
SurveyMap$tabulate(...)
Arguments
...

The names of the variables to include, as strings. If none are given, all variables in the mapping are used.

Returns

The SurveyMap object, invisibly.
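
As a rough illustration of what a poststratification table contains, the base R sketch below sums population weights within each cell defined by the chosen variables. It is an assumption-laden stand-in, not the code tabulate() runs: the popn data frame is made up, and the column name N is hypothetical.

```r
# Hypothetical mapped population data with a weight column (illustration only)
popn <- data.frame(
  age    = c("18-35", "18-35", "36-55", "36-55", "18-35"),
  gender = c("m", "f", "m", "f", "m"),
  wt     = c(1.0, 2.0, 1.5, 0.5, 3.0)
)

# One row per age-by-gender cell, with the weights in that cell summed
cells <- aggregate(wt ~ age + gender, data = popn, FUN = sum)

# Rename the summed weight to a (hypothetical) cell-size column
names(cells)[names(cells) == "wt"] <- "N"
print(cells)
```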


Method fit()

Fit a model. rstanarm, brms, and lme4 are supported natively. Custom modeling functions can also be used if they meet certain requirements.

Usage
SurveyMap$fit(fun, formula, ...)
Arguments
fun

The model fitting function to use. For example, fun=rstanarm::stan_glmer, fun=brms::brm, fun=lme4::glmer. A custom fun must have a formula argument and a data argument that accepts a data frame (like standard R modeling functions). Other arguments can be passed via .... The formula argument is taken from the formula argument below, and the data argument is automatically set to the mapped data created by the mapping method (you can access this data via the mapped_sample_data method).

formula

The model formula. Can be either a string or a formula object.

...

Arguments other than formula and data to pass to fun.

Returns

A SurveyFit object.
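
The stated requirements for a custom fun (a formula argument and a data argument that accepts a data frame) can be sketched with a thin wrapper around stats::glm(). The wrapper my_logit and the toy data frame are hypothetical. The sketch calls the wrapper directly rather than through fit(), and it makes no claim about whether the downstream SurveyFit methods work with the object it returns.

```r
# Hypothetical custom fitting function: it takes `formula` and `data`,
# as required of a custom `fun`, and forwards anything else to stats::glm()
my_logit <- function(formula, data, ...) {
  stats::glm(formula = formula, data = data, family = stats::binomial(), ...)
}

# Made-up data standing in for the mapped sample data
toy <- data.frame(
  y   = c(0, 1, 0, 1, 1, 0, 1, 1),
  age = factor(c("18-35", "18-35", "36-55", "36-55",
                 "18-35", "36-55", "36-55", "18-35"))
)

# A formula supplied as a string can be converted with stats::as.formula()
f <- stats::as.formula("y ~ age")
fit <- my_logit(formula = f, data = toy)
print(stats::coef(fit))
```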


Method item_map()

Access the item_map.

Usage
SurveyMap$item_map()
Returns

A named list of QuestionMaps.


Method sample()

Access the SurveyData object containing the sample data.

Usage
SurveyMap$sample()
Returns

A SurveyData object.


Method population()

Access the SurveyData object containing the population data.

Usage
SurveyMap$population()
Returns

A SurveyData object.


Method poststrat_data()

Access the poststratification data frame created by the tabulate method.

Usage
SurveyMap$poststrat_data()
Returns

A data frame.


Method mapped_sample_data()

Access the data frame containing the mapped sample data created by the mapping method.

Usage
SurveyMap$mapped_sample_data(key = TRUE)
Arguments
key

Should the .key column be included? This column just indicates the original order of the rows and is primarily intended for internal use.

Returns

A data frame.


Method mapped_population_data()

Access the data frame containing the mapped population data created by the mapping method.

Usage
SurveyMap$mapped_population_data(key = TRUE)
Arguments
key

Should the .key column be included? This column just indicates the original order of the rows and is primarily intended for internal use.

Returns

A data frame.


Method clone()

The objects of this class are cloneable with this method.

Usage
SurveyMap$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples


# Some fake survey data for demonstration
head(shape_survey)

# Create SurveyData object for the sample
box_prefs <- SurveyData$new(
  data = shape_survey,
  questions = list(
    age = "Please identify your age group",
    gender = "Please select your gender",
    vote_for = "Which party did you vote for in the 2018 election?",
    y = "If today is the election day, would you vote for the Box Party?"
  ),
  responses = list(
    age = levels(shape_survey$age),
    gender = levels(shape_survey$gender),
    # Here we use a data frame for the responses because the levels
    # in the data are abridged versions of the actual responses.
    # This can be useful when surveys have brief or non-descriptive responses.
    vote_for = data.frame(
      data = levels(shape_survey$vote_for),
      asked = c("Box Party Faction A", "Box Party Faction B",
                "Circle Party Coalition", "Circle Party")
    ),
    y = c("no", "yes")
  ),
  weights = "wt",
  design = list(ids =~1)
)
box_prefs$print()
box_prefs$n_questions()


# Some fake population data for demonstration
head(approx_voters_popn)

# Create SurveyData object for the population
popn_obj <- SurveyData$new(
  data = approx_voters_popn,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?",
    vote_pref = "Which party do you prefer to vote for?"
  ),
  # order doesn't matter (gender before age here) because
  # the list has the names of the variables
  responses = list(
    gender = levels(approx_voters_popn$gender),
    age_group = levels(approx_voters_popn$age_group),
    vote_pref = levels(approx_voters_popn$vote_pref)
  ),
  weights = "wt"
)
popn_obj$print()


# Create the QuestionMap objects mapping each question between the
# survey and population dataset
q_age <- QuestionMap$new(
  name = "age",
  col_names = c("age","age_group"),
  values_map = list(
    "18-25" = "18-35", "26-35" = "18-35","36-45" = "36-55",
    "46-55" = "36-55", "56-65" = "56-65", "66-75" = "66+", "76-90" = "66+"
  )
)
print(q_age)

q_party_pref <- QuestionMap$new(
  name = "party_pref",
  col_names = c("vote_for","vote_pref"),
  values_map = list("Box Party" = "BP",  "BP" = "BP","Circle Party" = "CP", "CP" = "CP")
)
q_gender <- QuestionMap$new(
  name = "gender",
  col_names = c("gender", "gender"),
  values_map = list("male" = "m","female" = "f", "nonbinary" = "nb")
)


# Create SurveyMap object adding all questions at once
ex_map <- SurveyMap$new(
  sample = box_prefs,
  population = popn_obj,
  q_age,
  q_party_pref,
  q_gender
)
print(ex_map) # or ex_map$print()

# Or can add questions incrementally
ex_map <- SurveyMap$new(sample = box_prefs, population = popn_obj)
print(ex_map)

ex_map$add(q_age, q_party_pref)
print(ex_map)

ex_map$add(q_gender)
print(ex_map)


# Create the mapping between sample and population
ex_map$mapping()

# Create the poststratification data frame using all variables in the mapping
# (alternatively, can specify particular variables, e.g. tabulate("age"))
ex_map$tabulate()

# Take a peek at the poststrat data frame
head(ex_map$poststrat_data())

## Not run: 
# Fit regression model using rstanarm (returns a SurveyFit object)
fit_1 <- ex_map$fit(
  fun = rstanarm::stan_glmer,
  formula = y ~ (1|age) + (1|gender),
  family = "binomial",
  seed = 1111,
  chains = 1, # just to keep the example fast and small
  refresh = 0 # suppress printed sampling iteration updates
)

# To use lme4 or brms instead of rstanarm you would use:
# Example lme4 usage
# fit_2 <- ex_map$fit(
#   fun = lme4::glmer,
#   formula = y ~ (1|age) + (1|gender),
#   family = "binomial"
# )

# Example brms usage
# fit_3 <- ex_map$fit(
#   fun = brms::brm,
#   formula = y ~ (1|age) + (1|gender),
#   family = "bernoulli",
#   seed = 1111
# )


# Predicted probabilities
# returns matrix with rows for poststrat cells, cols for posterior draws
poststrat_estimates <- fit_1$population_predict()

# Compute and summarize estimates by age level and party preference
estimates_by_age <- fit_1$aggregate(poststrat_estimates, by = "age")
estimates_by_party <- fit_1$aggregate(poststrat_estimates, by = "party_pref")

fit_1$summary(estimates_by_age)
fit_1$summary(estimates_by_party)

# Plot estimates
fit_1$plot(estimates_by_party)

fit_1$plot(estimates_by_age)

fit_1$plot(estimates_by_age, additional_stats = "none")
fit_1$plot(estimates_by_age, additional_stats = "wtd")
fit_1$plot(estimates_by_age, additional_stats = "raw")
fit_1$plot(estimates_by_age, additional_stats = c("wtd","raw","mrp"))

# Compute and summarize the population estimate
estimates_popn <- fit_1$aggregate(poststrat_estimates)
fit_1$summary(estimates_popn)

# Plot population estimate
fit_1$plot(estimates_popn)
fit_1$plot(estimates_popn, additional_stats = "none")
fit_1$plot(estimates_popn, additional_stats = "wtd")
fit_1$plot(estimates_popn, additional_stats = "raw")
fit_1$plot(estimates_popn, additional_stats = c("wtd","raw","mrp"))

## End(Not run)


lauken13/mrpkit documentation built on Aug. 6, 2023, 3:42 a.m.