get_totals: Get weighted percentages or totals

View source: R/get_totals.R

Description

Takes a categorical variable and returns the weighted or unweighted percentage or total of each category. The variable can optionally be crossed with one or more grouping variables, and several weights can be tabulated side by side for comparison.

Usage

get_totals(
  var,
  df,
  wt = NULL,
  by = NULL,
  by_total = FALSE,
  percent = TRUE,
  include_unw = FALSE,
  digits = NULL,
  complete = TRUE,
  na.rm = FALSE
)

Arguments

var

Character string giving the name of the variable to be tabulated. Specify interactions by separating variable names with a colon.

df

The data.frame containing the variables to be tabulated.

wt

A character vector containing the name(s) of the weight variable(s) to be used in tabulating results. If nothing is passed, results will be unweighted.

by

For crosstabulations, an optional character vector naming the variable(s) to be crossed with var. Multiple variable names can be passed as a vector.

by_total

Logical indicating whether a "total" column should be returned in addition to the columns defined by the variables passed to by. Defaults to FALSE.

percent

Should the results be scaled as percentages? Defaults to TRUE. If FALSE, weighted totals are returned.
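The difference between weighted totals and weighted percentages can be sketched in base R. This is only an illustration of the arithmetic on hypothetical toy data (the `toy` data frame and its values are made up here); it is not the package's implementation.

```r
# Hypothetical toy data, not part of pewmethods
toy <- data.frame(
  q1     = c("Approve", "Approve", "Disapprove", "Disapprove"),
  weight = c(1, 3, 2, 4)
)

# Analogue of percent = FALSE: sum the weights within each category
wtd_totals <- tapply(toy$weight, toy$q1, sum)

# Analogue of percent = TRUE: rescale the totals so they sum to 100
wtd_pct <- 100 * wtd_totals / sum(wtd_totals)

print(wtd_totals)  # Approve 4, Disapprove 6
print(wtd_pct)     # Approve 40, Disapprove 60
```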

include_unw

Logical indicating whether unweighted frequencies should be included in the output alongside the weighted results. Defaults to FALSE.

digits

The number of decimal places displayed. Defaults to the value specified in options("digits").

complete

TRUE/FALSE: Should factor levels with no observations be included in the results? Defaults to TRUE.

na.rm

If FALSE, NA values in var appear in the results as their own category and count toward the denominator when percentages are calculated. If TRUE, they are excluded from all calculations. Defaults to FALSE.
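The effect of NA handling on the denominator can be illustrated with a small base R sketch. The `toy_na` data frame is hypothetical, and the code only mimics the behavior described above; it makes no claim about how get_totals is implemented internally.

```r
# Hypothetical toy data with one missing response
toy_na <- data.frame(
  q1     = c("Approve", "Disapprove", NA, "Approve"),
  weight = c(2, 1, 1, 2)
)

# Analogue of na.rm = FALSE: NA is kept as a category and stays in the denominator
grp_keep <- addNA(factor(toy_na$q1))
pct_keep <- 100 * tapply(toy_na$weight, grp_keep, sum) / sum(toy_na$weight)

# Analogue of na.rm = TRUE: rows with NA are dropped before anything is computed
kept     <- toy_na[!is.na(toy_na$q1), ]
pct_drop <- 100 * tapply(kept$weight, kept$q1, sum) / sum(kept$weight)

print(round(pct_keep, 1))
print(pct_drop)  # Approve 80, Disapprove 20
```

With the missing row retained, "Approve" is 4 of 6 weighted cases; with it dropped, "Approve" is 4 of 5, so the same responses yield a higher percentage.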

Details

If by is not supplied, the column names are the weight names. If by is supplied, the column names are the categories of the grouping variable, and the output has an additional column identifying the weight.

Value

A data.frame with a column for the variable name, columns displaying percentages or totals, and additional columns as specified by arguments to this function.

Examples

library(pewmethods)  # provides get_totals, dk_to_na and the dec13_excerpt data
library(dplyr)
# Basic unweighted crosstab
get_totals("q1", dec13_excerpt)

# Totals instead of percentages
get_totals("q1", dec13_excerpt, percent = FALSE)

# Weighted crosstab
get_totals("receduc", dec13_excerpt, wt = "weight")

# Weighted crosstab by grouping variable
get_totals("q1", dec13_excerpt, wt = "weight", by = "receduc")

# Compare weights, including unweighted
# Let's make a fake weight by combining the landline and cellphone weights
dec13_excerpt <- dec13_excerpt %>% mutate(fake_weight = coalesce(llweight, cellweight))
get_totals("q1", dec13_excerpt, wt = c("weight", "fake_weight"), include_unw = TRUE)

# Use dplyr::filter along with complete = FALSE to remove unwanted categories from the base
get_totals("q1", dec13_excerpt %>% filter(q1 != "Don't know/Refused (VOL.)"), wt = "weight",
           complete = FALSE)

# Alternatively, filter unwanted categories out beforehand with dk_to_na
# and then use na.rm = TRUE
dec13_excerpt <- dec13_excerpt %>% mutate(q1 = dk_to_na(q1))
get_totals("q1", dec13_excerpt, wt = "weight", na.rm = TRUE)

pewresearch/pewmethods documentation built on March 27, 2020, 7:22 p.m.