In jdjohn215/pollster: Calculate Crosstab and Topline Tables of Weighted Survey Data

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(pollster)
library(dplyr)
library(knitr)
library(ggplot2)

Crosstabs can come in wide or long format. Each is useful, depending on your purpose. Wide data is best for display tables. Long data is usually better for making plots, for instance.

Here is a wide table.

crosstab(df = illinois, x = sex, y = educ6, weight = weight) %>%
  kable()

And here is long format.

crosstab(df = illinois, x = sex, y = educ6, weight = weight, format = "long")
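To make the difference between the two shapes concrete, here is a minimal base R sketch that does not use pollster at all. The `toy` data frame and its values are hypothetical, and the column name `n` is chosen only for illustration; the point is simply that the same weighted counts can be laid out as one row per `x` level (wide) or one row per `x`-by-`y` combination (long).

```r
# Hypothetical toy data, not the illinois dataset shipped with pollster
toy <- data.frame(
  sex    = c("Male", "Male", "Female", "Female", "Female"),
  educ   = c("HS", "BA", "HS", "BA", "BA"),
  weight = c(1, 3, 2, 1, 1)
)

# Wide: one row per sex, one column per education level (weighted counts)
wide <- xtabs(weight ~ sex + educ, data = toy)

# Long: one row per sex-by-education combination, counts in a single column
long <- as.data.frame(wide, responseName = "n")

wide
long
```

The long layout is the one that maps directly onto ggplot2 aesthetics, since each variable sits in its own column.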

By default, row percentages are used. You can also explicitly choose cell or column percentages using the pct_type argument. I discourage the use of column percentages--it's better to just flip the x and y variables and make row percents--but the option is included to match functionality provided by other standard statistical software.

# cell percentages
crosstab(df = illinois, x = sex, y = educ6, weight = weight, pct_type = "cell")

# column percentages
crosstab(df = illinois, x = sex, y = educ6, weight = weight, pct_type = "column")
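If it helps to see what the three percentage types mean arithmetically, the following base R sketch computes them from weighted counts with `xtabs()` and `prop.table()`. It is an illustration under assumed toy data (the `toy` data frame is hypothetical) and is not a description of how `crosstab()` is implemented internally.

```r
# Hypothetical toy data, not the illinois dataset shipped with pollster
toy <- data.frame(
  sex    = c("Male", "Male", "Female", "Female", "Female"),
  educ   = c("HS", "BA", "HS", "BA", "BA"),
  weight = c(1, 3, 2, 1, 1)
)

# Weighted counts: the sum of weights in each sex-by-educ cell
wtd <- xtabs(weight ~ sex + educ, data = toy)

row_pct  <- prop.table(wtd, margin = 1) * 100  # each row sums to 100
col_pct  <- prop.table(wtd, margin = 2) * 100  # each column sums to 100
cell_pct <- prop.table(wtd) * 100              # the whole table sums to 100

row_pct
col_pct
cell_pct
```

Note that the column percentages of this table equal the row percentages of the transposed table, which is why flipping x and y and using row percentages gives the same information.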

To make a graph, just feed your tibble output to a ggplot2 function.

crosstab(df = illinois, x = sex, y = educ6, weight = weight, format = "long") %>%
  ggplot(aes(x = educ6, y = pct, fill = sex)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  labs(title = "Educational attainment of the Illinois adult population by gender")

Margin of error

How the margin of error is calculated

The margin of error is calculated including the design effect of the sample weights, using the following formula:

sqrt(design effect)*zscore*sqrt((pct*(1-pct))/(n-1))*100

The design effect is calculated using the formula length(weights)*sum(weights^2)/(sum(weights)^2).
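The two formulas above can be written out as a short base R sketch. The helper names `design_effect()` and `moe_pct()` are hypothetical and exist only for this illustration; they are not pollster functions. The sketch assumes `pct` is a proportion between 0 and 1 (the result is scaled to percentage points by the final `* 100`) and, for simplicity, takes `n` to be the number of weights supplied.

```r
# Design effect of a vector of weights, following the formula in the text
design_effect <- function(w) {
  length(w) * sum(w^2) / sum(w)^2
}

# Margin of error in percentage points.
# Assumptions: pct is a proportion in [0, 1]; n is taken as length(w).
moe_pct <- function(pct, w, zscore = 1.96) {
  n <- length(w)
  sqrt(design_effect(w)) * zscore * sqrt((pct * (1 - pct)) / (n - 1)) * 100
}

# Hypothetical weights for illustration
weights <- c(0.5, 1.0, 1.5, 2.0, 1.0, 0.8, 1.2, 1.0)

design_effect(rep(1, 8))   # equal weights give a design effect of 1
design_effect(weights)     # unequal weights give a design effect above 1
moe_pct(0.4, weights)
```

Because the design effect enters under a square root, unequal weights widen the margin of error relative to a simple random sample of the same size.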

Get a crosstab table with the margin of error in a separate column using the moe_crosstab function. By default, a z-score of 1.96 (95% confidence interval) is used. Supply your own desired z-score using the zscore argument. Only row and cell percents are supported. By default, the table format is long because I anticipate making visualizations will be the most common use-case for this function.

moe_crosstab(illinois, educ6, voter, weight)

A wide format table looks like this.

moe_crosstab(illinois, educ6, voter, weight, format = "wide")

ggplot2 offers multiple ways to visualize the margin of error. Here is one good option. (Please note, if you don't have ggplot2 >= 3.3.0 you'll get an error message.)

illinois %>%
  filter(year == 2016) %>%
  moe_crosstab(educ6, voter, weight) %>%
  ggplot(aes(x = pct, y = educ6, xmin = (pct - moe), xmax = (pct + moe),
             color = voter)) +
  geom_pointrange(position = position_dodge(width = 0.2))

Special case, the x-variable identifies survey waves

If the x-variable in your crosstab uniquely identifies survey waves for which the weights were independently generated, it is best practice to calculate the design effect independently for each wave. moe_wave_crosstab does just that. All of the arguments remain the same as in moe_crosstab.

moe_wave_crosstab(df = illinois, x = year, y = rv, weight = weight)
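To see why the per-wave calculation matters, the base R sketch below compares a pooled design effect with design effects computed separately for each wave. The `waves` data frame and the `design_effect()` helper are hypothetical and used only for illustration; this is not the internal code of `moe_wave_crosstab()`.

```r
# Design effect of a vector of weights, following the formula given earlier
design_effect <- function(w) {
  length(w) * sum(w^2) / sum(w)^2
}

# Hypothetical data: two waves whose weights were generated independently
waves <- data.frame(
  year   = c(2014, 2014, 2014, 2016, 2016, 2016),
  weight = c(1, 1, 1, 0.5, 1, 1.5)
)

# One design effect per wave
deff_by_wave <- tapply(waves$weight, waves$year, design_effect)

# A single design effect computed across all waves at once
deff_pooled <- design_effect(waves$weight)

deff_by_wave
deff_pooled
```

In this toy example the 2014 wave has equal weights and so no inflation, while the 2016 wave does; a pooled value would overstate the margin of error for one wave and understate it for the other.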

jdjohn215/pollster documentation built on May 19, 2023, 4:34 p.m.
