ScrapeFunctions: Scraper Functions for General Texts
In lin-jennifer/poltextr: Easy Access Political Texts in R


Functions to scrape political texts from https://www.presidency.ucsb.edu

scrapePlat(url, content)

scrapeInaug(n, content)

scrapeSOTU(url, content)

scrapeUCSB(n, content)

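`url`	A URL, given as a quoted character string
`content`	CSS selector for the content to scrape from the page, e.g. '.field-docs-content'
`n`	The number at the end of the document URL; used by scrapeInaug() and scrapeUCSB() in place of a full URL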

scrapePlat() is designed to scrape party platforms but may also work for other texts on the https://www.presidency.ucsb.edu site. It returns a text string, which can then be converted to a data frame and manipulated as desired.
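As a rough illustration of the "convert to a data frame" step, the sketch below splits a text string into one row per paragraph using base R only. The `platform_text` value is a made-up stand-in for a scraper result, and the assumption that paragraphs are separated by newlines is not confirmed by the package; adjust the separator to match the string you actually get back.

```r
# Hypothetical stand-in for the string returned by a scraper call
# (placeholder text, not real platform content).
platform_text <- "First paragraph of the document.\nSecond paragraph of the document."

# Assumption: paragraphs are separated by newlines in the returned string.
paragraphs <- strsplit(platform_text, "\n", fixed = TRUE)[[1]]

# Trim surrounding whitespace and drop any empty pieces.
paragraphs <- trimws(paragraphs)
paragraphs <- paragraphs[nzchar(paragraphs)]

# One row per paragraph, with a running paragraph index.
platform_df <- data.frame(
  paragraph = seq_along(paragraphs),
  text = paragraphs,
  stringsAsFactors = FALSE
)
```

From here the data frame can be filtered, tokenized, or joined with other texts using whichever tools you prefer.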

scrapeInaug() is like scrapePlat() but is designed for Inaugural Addresses. Each inaugural address URL on the UCSB presidency site has the form "https://www.presidency.ucsb.edu/documents/inaugural-address-n", where the trailing "n" is the unique identifier for a given address. Therefore, scrapeInaug(15, '.field-docs-content') will give you President Obama's 2012 address. The "n" value can be located by reading the last number in the URL.
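The relationship between "n" and the address URL can be sketched in base R. The helpers `inaug_url()` and `inaug_n()` below are hypothetical names introduced only for illustration; they are not part of poltextr and may not match how scrapeInaug() builds its URL internally.

```r
# Hypothetical helper (not part of poltextr): build the address URL from n.
inaug_url <- function(n) {
  paste0("https://www.presidency.ucsb.edu/documents/inaugural-address-", n)
}

# Hypothetical helper (not part of poltextr): recover n from an address URL
# by extracting the digits that follow the final hyphen.
inaug_n <- function(url) {
  as.integer(sub(".*-([0-9]+)$", "\\1", url))
}

address_url <- inaug_url(15)
address_n <- inaug_n(address_url)
```

Going from a URL back to "n" this way can be handy when you have a list of address links and want to loop over them with scrapeInaug().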

# Party Platforms
url <- "https://www.presidency.ucsb.edu/documents/2016-democratic-party-platform"
DemPP2016 <- scrapePlat(url, '.field-docs-content')

# Inaugural Address
# Source: https://www.presidency.ucsb.edu/documents/inaugural-address-15
# n = 15
Obama2012 <- scrapeInaug(15, '.field-docs-content')

# State of the Union
url <- "https://www.presidency.ucsb.edu/documents/address-before-joint-session-the-congress-the-state-the-union-19"
ObamaSOTU16 <- scrapeSOTU(url, '.field-docs-content')

# Any Text on UCSB
# https://www.presidency.ucsb.edu/node/321069
# n = 321069
ObamaFarewell <- scrapeUCSB(321069, '.field-docs-content')