knitr::opts_knit$set( stop_on_error = 2L ) knitr::opts_chunk$set( fig.height = 11, fig.width = 7 ) options(cite = FALSE)
Logit Regression for Dichotomous Dependent Variables with Survey Weights with
logit.survey
.
Use logit regression to model binary dependent variables specified as a function of a set of explanatory variables.
rm(list=ls(pattern="\\.out")) suppressWarnings(suppressMessages(library(Zelig))) set.seed(1234)
Our example dataset comes from the survey package:
data(api, package = "survey")
In this example, we will estimate a model using the percentages of students who receive subsidized lunch and the percentage who are new to a school to predict whether each California public school attends classes year round. We first make a numeric version of the variable in the example dataset, which you may not need to do in another dataset.
library(survey) apistrat.design <- svydesign(ids = ~1, weights = ~pw, data = apistrat) m1 <- svyglm(yr.rnd ~ meals + mobility, design = apistrat.design, family = quasibinomial()) summary(m1)
Set explanatory variables to their observed values, and set a high (80th percentile) and low (20th percentile) value for "meals," the percentage of students who receive subsidized meals:
library(smargins) m.sm1 <- smargins(m1, meals = quantile(meals, c(0.2, 0.8))) summary(m.sm1)
Generate first differences for the effect of high versus low "meals" on the probability that a school will hold classes year round:
m.sm1.diff <- scompare(m.sm1, "meals") summary(m.sm1.diff)
Generate a second set of fitted values and a plot:
library(ggplot2) ggplot(m.sm1.diff, aes(x = .smargin_qi)) + geom_density(fill = "blue", alpha = 0.25)
Suppose that the survey house that provided the dataset excluded probability weights but made other details about the survey design available. We can still estimate a model without probability weights that takes instead variables that identify each the stratum and/or cluster from which each observation was selected and the size of the finite sample from which each observation was selected.
apiclus.design <- svydesign(ids = ~1, strata = ~stype, fpc = ~fpc, data = apistrat) m2 <- svyglm(yr.rnd ~ meals + mobility, design = apiclus.design, family = quasibinomial()) summary(m2)
The coefficient estimates from this model are identical to point estimates in the previous example, but the standard errors are smaller.
m.sm2 <- smargins(m2, meals = quantile(apistrat$meals, c(0.2, 0.8))) summary(m.sm2)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.