SurveyFit: SurveyFit
In lauken13/mrpkit: Multilevel Regression with Post-Stratification

Description

An R6 SurveyFit object stores a fitted model object and provides methods for generating predicted probabilities for all poststratification cells, generating population and group estimates, and visualizing results.

Methods

Method `new()`

Create a new SurveyFit object. This method is called internally by the fit method of the SurveyMap object and does not need to be called directly by the user.

Usage

SurveyFit$new(fit, map, formula)

Arguments

fit: A fitted model object.
map: A SurveyMap object.
formula: A formula object for the model that was fit.

Returns

A SurveyFit object.

Method `fit()`

Access the fitted model object.

Usage

SurveyFit$fit()

Returns

The fitted model object created by the modeling function called by the fit method of the SurveyMap object. For example, if using rstanarm::stan_glmer() then a stanreg object from rstanarm is returned.

Method `map()`

Access the SurveyMap object.

Usage

SurveyFit$map()

Returns

The SurveyMap associated with the SurveyFit object.

Method `formula()`

Access the model formula.

Usage

SurveyFit$formula()

Returns

The model formula used when fitting the model.

Method `print()`

Call the fitted model object's print method. The console output from this method depends on the model fitting function used.

Usage

SurveyFit$print(...)

Arguments

...: Optional arguments to pass to the print method.

Returns

The SurveyFit object, invisibly.

Method `population_predict()`

Use the fitted model to add predicted probabilities to the poststratification dataset.

Usage

SurveyFit$population_predict(..., fun = NULL)

Arguments

...: Arguments other than the fitted model object and poststratification data frame to pass to fun.
fun: The function used to generate the predicted probabilities. Specify this only if you used a model fitting function that mrpkit does not support natively. For models fit using rstanarm, brms, or lme4, fun is handled automatically. If fun is specified, then:

the first argument should be the fitted model object
the second argument should be the poststratification data frame
it can take an arbitrary number of other arguments
the returned object should match the specifications in the 'Returns' section below so that it is compatible with subsequent methods

Returns

A matrix with rows corresponding to poststratification cells and columns corresponding to posterior samples (or approximate ones in the case of lme4 models).
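
The contract for a user-supplied fun can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name glm_predict_draws, the n_draws and seed arguments, and the toy data are hypothetical, and the code does not call mrpkit. It uses a plain stats::glm() fit and a crude normal approximation on the logit scale, applied independently to each cell, to produce a matrix of the required shape.

```r
# Hypothetical custom prediction function (not part of mrpkit).
# First argument: fitted model; second argument: poststratification data frame.
glm_predict_draws <- function(fit, poststrat, n_draws = 100, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Point predictions and standard errors on the link (logit) scale
  pred <- predict(fit, newdata = poststrat, type = "link", se.fit = TRUE)
  n_cells <- nrow(poststrat)
  # Standard normal noise: one row per cell, one column per approximate draw
  noise <- matrix(rnorm(n_cells * n_draws), nrow = n_cells, ncol = n_draws)
  # Vectors recycle down columns, so row i uses pred$fit[i] and pred$se.fit[i]
  plogis(pred$fit + pred$se.fit * noise)
}

# Toy data standing in for a survey sample and a poststratification table
set.seed(1)
toy_sample <- data.frame(
  age = factor(sample(c("18-35", "36-55", "56+"), 200, replace = TRUE)),
  gender = factor(sample(c("f", "m"), 200, replace = TRUE))
)
toy_sample$y <- rbinom(200, 1, 0.4)
toy_fit <- glm(y ~ age + gender, family = binomial(), data = toy_sample)
toy_poststrat <- expand.grid(
  age = levels(toy_sample$age),
  gender = levels(toy_sample$gender)
)

draws <- glm_predict_draws(toy_fit, toy_poststrat, n_draws = 50, seed = 2)
dim(draws) # 6 cells (rows) by 50 approximate draws (columns)
```

A function of this shape would presumably be passed as fun, with extra arguments such as n_draws forwarded through the ... argument; whether mrpkit accepts a glm fit in the first place is not confirmed by this page.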

Method `aggregate()`

Aggregate estimates to the population level or by level of a grouping variable.

Usage

SurveyFit$aggregate(poststrat_estimates, by = NULL)

Arguments

poststrat_estimates: The object returned by the population_predict method.
by: Optionally, a string specifying a grouping variable. If specified, the aggregation is done by level of the named variable. If not specified, population-level estimates are computed.

Returns

A data frame. If by is not specified, the data frame has one row per posterior draw. If by is specified, the data frame has one row for each combination of posterior draw and level of the by variable (number of draws times number of levels), plus an extra column indicating which level of the by variable each row corresponds to.
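
The shape of these results can be illustrated with the standard poststratification step: a population-size-weighted average of the cell-level predictions, computed separately for each draw. The sketch below is an assumption about how such an aggregation works in general, not a copy of mrpkit's implementation; the column names n_cell and value and the toy numbers are hypothetical.

```r
set.seed(123)

# Toy poststratification table: one row per cell, with hypothetical cell sizes
poststrat <- expand.grid(age = c("18-35", "36-55", "56+"), gender = c("f", "m"))
poststrat$n_cell <- c(120, 200, 150, 110, 210, 140)

# Toy stand-in for the prediction matrix: rows are cells, columns are draws
n_draws <- 4
cell_draws <- matrix(runif(nrow(poststrat) * n_draws), nrow = nrow(poststrat))

# Population level: weight each cell by its size, one estimate per draw
popn_est <- data.frame(
  value = colSums(cell_draws * poststrat$n_cell) / sum(poststrat$n_cell)
)

# By group: repeat the weighted average within each level of the grouping variable
by_age <- do.call(rbind, lapply(levels(poststrat$age), function(lev) {
  rows <- poststrat$age == lev
  w <- poststrat$n_cell[rows]
  data.frame(
    age = lev,
    value = colSums(cell_draws[rows, , drop = FALSE] * w) / sum(w)
  )
}))

nrow(popn_est) # one row per draw
nrow(by_age)   # draws times number of age levels
```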

Method `summary()`

Create a set of summary statistics for the MRP estimate, along with the corresponding weighted and raw data estimates.

Usage

SurveyFit$summary(aggregated_estimates)

Arguments

aggregated_estimates: The data frame returned by the aggregate method.

Returns

A data frame with a minimum of three rows containing the raw, MRP, and weighted estimates, plus an estimate of the standard error for each. If the aggregated estimates were computed with a by argument (indicating sub-population or small area estimates), the data frame has a number of rows equal to three times the number of small areas.
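
As a rough illustration of what such a summary contains, the sketch below summarizes toy posterior draws by their mean and standard deviation within each group, and computes a raw estimate as the sample mean of binary responses with a binomial standard error. This is an assumption about the kind of computation involved, not mrpkit's code; the column names (value, method, mean, sd) and all data are hypothetical, and the survey-weighted estimate is omitted because it relies on the survey package.

```r
set.seed(42)

# Toy stand-in for aggregated estimates by group: 500 draws per age level
agg <- data.frame(
  age = rep(c("18-35", "36-55"), each = 500),
  value = c(rbeta(500, 40, 60), rbeta(500, 55, 45))
)

# MRP-style summary: mean and standard deviation of the draws within each level
mrp_summary <- do.call(rbind, lapply(split(agg, agg$age), function(d) {
  data.frame(age = d$age[1], method = "mrp", mean = mean(d$value), sd = sd(d$value))
}))

# Raw estimate: direct mean of binary responses with a binomial standard error
y <- rbinom(300, 1, 0.45)
p_hat <- mean(y)
raw_summary <- data.frame(
  method = "raw",
  mean = p_hat,
  sd = sqrt(p_hat * (1 - p_hat) / length(y))
)

mrp_summary
raw_summary
```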

Method `plot()`

Visualize population or sub-population estimates.

When passed the data frame containing the posterior distribution of the population MRP estimate, a density plot is generated. When visualizing sub-populations, a violin plot of the posterior distribution of the aggregated MRP estimates is generated for each level of the grouping variable. The additional_stats argument controls which other information is overlaid on the plot.

Usage

SurveyFit$plot(aggregated_estimates, additional_stats = c("wtd", "raw"))

Arguments

aggregated_estimates: The data frame returned by the aggregate method.
additional_stats: A vector specifying which of the three additional stats ("wtd", "raw", "mrp") should be overlaid on the plot, or "none" to overlay nothing. The default is to overlay intervals for the weighted and raw estimates on top of the plot representing the MRP estimates. The weighted estimates are computed by passing the optional survey weights and design specified in the SurveyData to the survey package. The raw estimate is the direct mean of the binary responses, with a binomial sd. Uncertainty estimates for the additional_stats are included on violin plots but not on density plots. Intervals are 95% CI.

Returns

A ggplot object that is either a violin plot if showing small area level (sub-population) estimates, or a density plot if showing population estimates.

Method `clone()`

The objects of this class are cloneable with this method.

Usage

SurveyFit$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples


# Some fake survey data for demonstration
head(shape_survey)

# Create SurveyData object for the sample
box_prefs <- SurveyData$new(
  data = shape_survey,
  questions = list(
    age = "Please identify your age group",
    gender = "Please select your gender",
    vote_for = "Which party did you vote for in the 2018 election?",
    y = "If today is the election day, would you vote for the Box Party?"
  ),
  responses = list(
    age = levels(shape_survey$age),
    gender = levels(shape_survey$gender),
    # Here we use a data frame for the responses because the levels
    # in the data are abridged versions of the actual responses.
    # This can be useful when surveys have brief/non descriptive responses.
    vote_for = data.frame(
      data = levels(shape_survey$vote_for),
      asked = c("Box Party Faction A", "Box Party Faction B",
                "Circle Party Coalition", "Circle Party")
    ),
    y = c("no", "yes")
  ),
  weights = "wt",
  design = list(ids = ~1)
)
box_prefs$print()
box_prefs$n_questions()


# Some fake population data for demonstration
head(approx_voters_popn)

# Create SurveyData object for the population
popn_obj <- SurveyData$new(
  data = approx_voters_popn,
  questions = list(
    age_group = "Which age group are you?",
    gender = "Gender?",
    vote_pref = "Which party do you prefer to vote for?"
  ),
  # order doesn't matter (gender before age here) because
  # the list has the names of the variables
  responses = list(
    gender = levels(approx_voters_popn$gender),
    age_group = levels(approx_voters_popn$age_group),
    vote_pref = levels(approx_voters_popn$vote_pref)
  ),
  weights = "wt"
)
popn_obj$print()


# Create the QuestionMap objects mapping each question between the
# survey and population dataset
q_age <- QuestionMap$new(
  name = "age",
  col_names = c("age","age_group"),
  values_map = list(
    "18-25" = "18-35", "26-35" = "18-35","36-45" = "36-55",
    "46-55" = "36-55", "56-65" = "56-65", "66-75" = "66+", "76-90" = "66+"
  )
)
print(q_age)

q_party_pref <- QuestionMap$new(
  name = "party_pref",
  col_names = c("vote_for","vote_pref"),
  values_map = list("Box Party" = "BP",  "BP" = "BP","Circle Party" = "CP", "CP" = "CP")
)
q_gender <- QuestionMap$new(
  name = "gender",
  col_names = c("gender", "gender"),
  values_map = list("male" = "m","female" = "f", "nonbinary" = "nb")
)


# Create SurveyMap object adding all questions at once
ex_map <- SurveyMap$new(
  sample = box_prefs,
  population = popn_obj,
  q_age,
  q_party_pref,
  q_gender
)
print(ex_map) # or ex_map$print()

# Or can add questions incrementally
ex_map <- SurveyMap$new(sample = box_prefs, population = popn_obj)
print(ex_map)

ex_map$add(q_age, q_party_pref)
print(ex_map)

ex_map$add(q_gender)
print(ex_map)


# Create the mapping between sample and population
ex_map$mapping()

# Create the poststratification data frame using all variables in the mapping
# (alternatively, can specify particular variables, e.g. tabulate("age"))
ex_map$tabulate()

# Take a peek at the poststrat data frame
head(ex_map$poststrat_data())

## Not run: 
# Fit regression model using rstanarm (returns a SurveyFit object)
fit_1 <- ex_map$fit(
  fun = rstanarm::stan_glmer,
  formula = y ~ (1|age) + (1|gender),
  family = "binomial",
  seed = 1111,
  chains = 1, # just to keep the example fast and small
  refresh = 0 # suppress printed sampling iteration updates
)

# To use lme4 or brms instead of rstanarm you would use:
# Example lme4 usage
# fit_2 <- ex_map$fit(
#   fun = lme4::glmer,
#   formula = y ~ (1|age) + (1|gender),
#   family = "binomial"
# )

# Example brms usage
# fit_3 <- ex_map$fit(
#   fun = brms::brm,
#   formula = y ~ (1|age) + (1|gender),
#   family = "bernoulli",
#   seed = 1111
# )


# Predicted probabilities
# returns matrix with rows for poststrat cells, cols for posterior draws
poststrat_estimates <- fit_1$population_predict()

# Compute and summarize estimates by age level and party preference
estimates_by_age <- fit_1$aggregate(poststrat_estimates, by = "age")
estimates_by_party <- fit_1$aggregate(poststrat_estimates, by = "party_pref")

fit_1$summary(estimates_by_age)
fit_1$summary(estimates_by_party)

# Plot estimates
fit_1$plot(estimates_by_party)

fit_1$plot(estimates_by_age)

fit_1$plot(estimates_by_age, additional_stats = "none")
fit_1$plot(estimates_by_age, additional_stats = "wtd")
fit_1$plot(estimates_by_age, additional_stats = "raw")
fit_1$plot(estimates_by_age, additional_stats = c("wtd","raw","mrp"))

# Compute and summarize the population estimate
estimates_popn <- fit_1$aggregate(poststrat_estimates)
fit_1$summary(estimates_popn)

# Plot population estimate
fit_1$plot(estimates_popn)
fit_1$plot(estimates_popn, additional_stats = "none")
fit_1$plot(estimates_popn, additional_stats = "wtd")
fit_1$plot(estimates_popn, additional_stats = "raw")
fit_1$plot(estimates_popn, additional_stats = c("wtd","raw","mrp"))

## End(Not run)

lauken13/mrpkit documentation built on Aug. 6, 2023, 3:42 a.m.
