get_totals: Get weighted percentages or totals
In pewresearch/pewmethods: Pew Research Center Methods Miscellaneous Functions


Takes a categorical variable and returns the weighted or unweighted percentage or total for each category. The result can optionally be crossed with one or more grouping variables, and several weights can be tabulated side by side for comparison.

get_totals(
  var,
  df,
  wt = NULL,
  by = NULL,
  by_total = FALSE,
  percent = TRUE,
  include_unw = FALSE,
  digits = NULL,
  complete = TRUE,
  na.rm = FALSE
)

`var`	`character`, indicating the name of the variable to be tabulated. Specify interactions by separating variable names with a colon.
`df`	The `data.frame` containing the variables to be tabulated.
`wt`	A `character` vector containing the name(s) of the weight variable(s) to be used in tabulating results. If nothing is passed, results will be unweighted.
`by`	An optional `character` vector containing the name(s) of the variable(s) to be crossed with `var` when creating crosstabulations.
`by_total`	`logical` indicating whether a "total" column should be returned in addition to columns defined by variables passed to `by`. Defaults to `FALSE`.
`percent`	Should the results be scaled as percentages? Defaults to `TRUE`. If `FALSE`, weighted totals are returned.
`include_unw`	`logical` indicating whether unweighted frequencies should be included in the output along with the weighted results. Defaults to `FALSE`.
`digits`	The number of decimal places displayed. Defaults to the value specified in `options("digits")`.
`complete`	`logical` indicating whether factor levels with no observations should be included in the results. Defaults to `TRUE`.
`na.rm`	If `FALSE`, `NA` values in `var` are included in the results and included in the denominator for calculating percentages. If `TRUE`, they are excluded from any calculations. Defaults to `FALSE`.
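
The effect of `na.rm` on the denominator can be illustrated with a small base R sketch. This is not the package's implementation; the `toy` data frame and its columns `answer` and `w` are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data: one categorical answer and one weight column
toy <- data.frame(
  answer = factor(c("Yes", "No", "Yes", NA, "No"), levels = c("Yes", "No")),
  w = c(1, 2, 1, 4, 2)
)

# NA kept as its own category (analogous to na.rm = FALSE):
# the NA weight counts toward the denominator
totals_with_na <- tapply(toy$w, addNA(toy$answer), sum)
pct_with_na <- 100 * totals_with_na / sum(totals_with_na)

# NA rows dropped first (analogous to na.rm = TRUE):
# the denominator only covers non-missing answers
kept <- toy[!is.na(toy$answer), ]
totals_no_na <- tapply(kept$w, kept$answer, sum)
pct_no_na <- 100 * totals_no_na / sum(totals_no_na)

print(pct_with_na)  # Yes 20, No 40, NA 40
print(pct_no_na)    # Yes about 33.3, No about 66.7
```

In this sketch, excluding the missing values raises the share of every remaining category because the total weight in the denominator drops from 10 to 6.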

If nothing is passed to `by`, the column names will be the weight names. If one or more variables are passed to `by`, the column names will be the categories of the grouping variable, and the output will have an additional column identifying the weight.
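
As a rough picture of the layout without a grouping variable (one column per weight), the following base R sketch builds a comparable table by hand. It only approximates the shape described here and is not how `get_totals` is implemented; the `toy` data frame and the weight names `w1` and `w2` are hypothetical.

```r
# Hypothetical toy data with two competing weights
toy <- data.frame(
  answer = factor(c("Yes", "No", "Yes", "No")),
  w1 = c(1, 1, 2, 2),
  w2 = c(3, 1, 1, 1)
)
weight_names <- c("w1", "w2")

# One column of weighted percentages per weight
pct <- sapply(weight_names, function(w) {
  totals <- tapply(toy[[w]], toy$answer, sum)
  as.vector(100 * totals / sum(totals))
})

# First column holds the categories, remaining columns are named after the weights
result <- data.frame(answer = levels(toy$answer), pct, row.names = NULL)
print(result)
```

Placing the weights in adjacent columns makes it easy to see how much each category's share shifts from one weight to another.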

A `data.frame` with a column for the variable name, columns displaying percentages or totals, and additional columns as specified by arguments to this function.

library(dplyr)
# Basic unweighted tabulation of a single variable
get_totals("q1", dec13_excerpt)

# Totals instead of percentages
get_totals("q1", dec13_excerpt, percent = FALSE)

# Weighted tabulation of a single variable
get_totals("receduc", dec13_excerpt, wt = "weight")

# Weighted crosstab by grouping variable
get_totals("q1", dec13_excerpt, wt = "weight", by = "receduc")

# Compare weights, including unweighted
# Let's make a fake weight by combining the landline and cellphone weights
dec13_excerpt <- dec13_excerpt %>% mutate(fake_weight = coalesce(llweight, cellweight))
get_totals("q1", dec13_excerpt, wt = c("weight", "fake_weight"), include_unw = TRUE)

# Use dplyr::filter along with complete = FALSE to remove unwanted categories from the base
get_totals("q1", dec13_excerpt %>% filter(q1 != "Don't know/Refused (VOL.)"), wt = "weight",
           complete = FALSE)

# Alternatively, filter unwanted categories out beforehand with dk_to_na
# and then use na.rm = TRUE
dec13_excerpt <- dec13_excerpt %>% mutate(q1 = dk_to_na(q1))
get_totals("q1", dec13_excerpt, wt = "weight", na.rm = TRUE)