In decktools/votesmart: Wrapper for the Project 'VoteSmart' API

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(votesmart)

The first step to using the votesmart package is to register an API key and store it in an environment variable by following these instructions.

Let's make sure our API key is set.

# If our key is not registered in this environment variable,
# the result of `Sys.getenv("VOTESMART_API_KEY")` will be `""` (i.e. a string of `nchar` 0)
key <- Sys.getenv("VOTESMART_API_KEY")

key_exists <- (nchar(key) > 0)

if (!key_exists) knitr::knit_exit()

We'll also attach dplyr for working with dataframes.

suppressPackageStartupMessages(library(dplyr))
conflicted::conflict_prefer("filter", "dplyr")

Motivation

Some of these functions are necessary precursors to obtain data you might want. For instance, in order to get candidates' ratings by SIGs, you'll need to get office_level_ids in order to get office_ids, which is a required argument to get candidate information using candidates_get_by_office_state. We'll go through what might be a typical example of how you might use the votesmart package.

Get Candidate Info

There are currently three functions for getting data on VoteSmart candidates: candidates_get_by_lastname, candidates_get_by_levenshtein, and candidates_get_by_office_state.

Let's search for former US House Rep Barney Frank using candidates_get_by_lastname.

From ?candidates_get_by_lastname, this function's defaults are:

candidates_get_by_lastname(
  last_names,
  election_years = lubridate::year(lubridate::today()),
  stage_ids = "",
  all = TRUE,
  verbose = TRUE
)

Since the default election year is the current year and Barney Frank left office in 2013, we'll specify a few years in which he ran for office.

(franks <-
  candidates_get_by_lastname(
    last_names = "frank",
    election_years = c(2000, 2004)
  )
)

Looking at the first_name column, are a number of non-Barneys returned. We can next filter our results to Barney.

(barneys <-
  franks %>%
  filter(first_name == "Barney") %>%
  select(
    candidate_id, first_name, last_name,
    election_year, election_state_id, election_office
  )
)

The two rows returned correspond to the two election_years we specified. Each candidate gets their own unique candidate_id, which we can pull out.

(barney_id <-
  barneys %>%
  pull(candidate_id) %>%
  unique()
)

Get Candidates' Ratings

One of the most powerful things about VoteSmart is its wealth of information about candidates' positions on issues as rated by a number of Special Interest Groups, or SIGs.

Given a candidate_id, we can ask for those ratings using rating_get_candidate_ratings.

(barney_ratings <-
  rating_get_candidate_ratings(
    candidate_ids = barney_id,
    sig_ids = "" # All SIGs
  )
)

There are a lot of columns here because some ratings are tagged with multiple categories.

main_cols <- c("rating", "category_name_1", "sig_id", "timespan")

We'll filter to Barney's ratings on the environment using just the first category name.

(barney_on_env <-
  barney_ratings %>%
  filter(category_name_1 == "Environment") %>%
  select(main_cols)
)

Something to be aware of is that some SIGs give ratings as letter grades:

barney_ratings %>%
  filter(
    stringr::str_detect(rating, "[A-Z]")
  ) %>%
  select(rating, category_name_1)

But using just Barney's number grades, we can get his average rating on this category per timespan:

barney_on_env %>%
  group_by(timespan) %>%
  summarise(
    avg_rating = mean(as.numeric(rating), na.rm = TRUE)
  ) %>%
  arrange(desc(timespan))

Keep in mind that these are ratings given by SIGs, which often have very different baseline stances on issues. For example, a pro-life group might give a candidate a rating of 0 whereas a pro-choice group might give that same candidate a 100.

barney_ratings %>%
  filter(category_name_1 == "Abortion") %>%
  select(
    rating, sig_id, category_name_1
  )

SIGs

When it comes to the Special Interest Groups themselves, the result of rating_get_candidate_ratings only supplies us with a sig_id.

We can get more information about these SIGs given these IDs with rating_get_sig.

(some_sigs <-
  barney_ratings %>%
  pull(sig_id) %>%
  unique() %>%
  sample(3)
)

rating_get_sig(
  sig_ids = some_sigs
)

Or, if we don't yet know any sig_ids, we can get a dataframe of them with the function rating_get_sig_list.

That function requires a vector of issue category_ids, however, so let's first get a vector of some category_ids.

(category_df <-
  rating_get_categories(
    state_ids = NA # NA for national
  ) %>%
  distinct() %>%
  sample_n(nrow(.)) # Sampling so we can see multiple categories in the 10 rows shown here
)

Now we can get our dataframe of SIGs given some categories.

(some_categories <- category_df$category_id %>% sample(3))

(sigs <-
  rating_get_sig_list(
    category_ids = some_categories,
    state_ids = NA
  ) %>%
  select(sig_id, name, category_id, state_id) %>%
  sample_n(nrow(.))
)

We already have the category names corresponding to those category_ids in our category_df, so we can join category_df onto sigss to attach category_name_1s to each of those SIGs.

sigs %>%
  rename(
    sig_name = name
  ) %>%
  left_join(
    category_df,
    by = c("state_id", "category_id")
  ) %>%
  rename(
    category_name_1 = name
  ) %>%
  sample_n(nrow(.))