Get House of Representatives chronology

Uses webdriver to interact with the Chief Clerk's Office website and scrape the Chronological List of House Representatives.

Outputs data-raw/chronology-page.html

Required setup:

# remotes::install_github("rstudio/webdriver")
# webdriver::install_phantomjs()
library(rvest)
library(tidyverse)
library(here)
library(webdriver)

Visit the page, expand all sessions, and save the HTML

pjs <- run_phantomjs()
ses <- Session$new(port = pjs$port)
ses$go("https://www.oregonlegislature.gov/chief-clerk/Pages/representatives.aspx")
ses$takeScreenshot()
leg_sessions <- ses$findElements(xpath = "//td/a")
walk(leg_sessions, ~ .$click())

# Expanding every session can take a while. The page source contains
# "Working on it" while sessions are still loading, so poll until it is gone.
html <- ses$getSource()
working <- str_detect(html, "Working on it")

while (working) {
  Sys.sleep(5)
  html <- ses$getSource()
  working <- str_detect(html, "Working on it")
}
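
The loop above waits indefinitely if the "Working on it" text never disappears. A minimal sketch of the same polling idea with a timeout is shown here. `wait_until()` and `fake_source()` are hypothetical names introduced for illustration, and `fake_source()` stands in for `ses$getSource()` so the example runs without a browser session.

```r
# Hypothetical helper: call `is_done()` every `interval` seconds until it
# returns TRUE (result: TRUE) or `timeout` seconds have passed (result: FALSE).
wait_until <- function(is_done, timeout = 120, interval = 5) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(is_done())) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Stand-in for ses$getSource() (an assumption for this sketch): the first two
# polls still show the loading message, the third shows the finished page.
polls <- 0
fake_source <- function() {
  polls <<- polls + 1
  if (polls < 3) "<p>Working on it...</p>" else "<table>...</table>"
}

finished <- wait_until(
  function() !grepl("Working on it", fake_source(), fixed = TRUE),
  timeout = 5,
  interval = 0.1
)
```

With a live session, the predicate could instead be `function() !str_detect(ses$getSource(), "Working on it")`, and the script could stop with an error when `wait_until()` returns `FALSE`.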
# Record the retrieval time as an HTML comment on the first line of the file
retrieved <- Sys.time()
retrieved_line <- paste0(
  "<!-- Retrieved ",
  format(retrieved, "%F %T %Z"),
  " -->"
)

write_lines(c(retrieved_line, html), here("data-raw", "chronology-page.html"))
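
Because the retrieval time is written as the first line of the saved file, a later script can read it back without parsing the whole page. The sketch below uses base R only and writes to a temporary file. The timestamp value is made up for illustration, and the pattern assumes the time zone abbreviation produced by `%Z` is a single word such as `PST`.

```r
# Illustrative first line in the format produced by "%F %T %Z"
# (the date, time and zone here are invented for the example).
example_line <- "<!-- Retrieved 2019-01-01 12:00:00 PST -->"

# Write a tiny stand-in for data-raw/chronology-page.html to a temp file.
path <- tempfile(fileext = ".html")
writeLines(c(example_line, "<html><body></body></html>"), path)

# Read only the first line and pull out the timestamp and zone label.
first_line <- readLines(path, n = 1)
pattern <- "^<!-- Retrieved (\\S+ \\S+) (\\S+) -->$"
stamp <- sub(pattern, "\\1", first_line)
zone <- sub(pattern, "\\2", first_line)
```

The zone is kept as a plain label rather than parsed, since `%Z` is an output-only format in R's date-time conversion.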


or-house-vis/history documentation built on May 15, 2019, 1:11 p.m.