ScrapeFunctions: Scraper Functions for General Texts

Description

Functions to scrape political texts from https://www.presidency.ucsb.edu

Usage

scrapePlat(url, content)

scrapeInaug(n, content)

scrapeSOTU(url, content)

scrapeUCSB(n, content)

Arguments

url

The URL of the page to scrape, as a quoted character string

content

CSS selector for the content you want to scrape from the page, e.g. '.field-docs-content'

n

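A number. For scrapeInaug(), the number at the end of an Inaugural Address URL; for scrapeUCSB(), the node number at the end of a https://www.presidency.ucsb.edu/node/ URL, as in the Examples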

Details

scrapePlat() is designed to scrape party platforms, but it may also work for other texts on the https://www.presidency.ucsb.edu site. It returns a text string, which can then be converted to a data frame and manipulated as desired.
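As a rough illustration of the "convert to a data frame" step, here is a minimal base R sketch. It does not call scrapePlat(); `platform_text` is a made-up stand-in for the returned string, and the assumption that paragraphs are separated by newlines is ours, not something the package documents.

```r
# Hypothetical stand-in for the string returned by scrapePlat();
# no network request is made here.
platform_text <- paste(
  "We believe in opportunity.",
  "We believe in fairness.",
  "",
  "We will build an economy that works.",
  sep = "\n"
)

# Assumption: paragraphs are separated by one or more newlines.
paragraphs <- strsplit(platform_text, "\n+")[[1]]
paragraphs <- trimws(paragraphs)
paragraphs <- paragraphs[nzchar(paragraphs)]  # drop empty pieces

# One row per paragraph, with a running index.
platform_df <- data.frame(
  paragraph = seq_along(paragraphs),
  text = paragraphs,
  stringsAsFactors = FALSE
)
```

If the scraped text uses a different separator, only the pattern passed to strsplit() should need to change.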

scrapeInaug() works like scrapePlat() but is designed for Inaugural Addresses. Each Inaugural Address URL on the UCSB presidency site has the form "https://www.presidency.ucsb.edu/documents/inaugural-address-n", where the trailing "n" is the unique identifier for a given address. For example, scrapeInaug(15, '.field-docs-content') returns President Obama's 2012 address. To find the "n" value, read the number at the end of the address URL.
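The relationship between "n" and the address URL can be sketched in base R. The helpers `inaug_url()` and `inaug_n()` below are hypothetical and not part of the package; they only mirror the URL pattern described above and may differ from how scrapeInaug() builds its URL internally.

```r
# Hypothetical helper: build the Inaugural Address URL from n.
inaug_url <- function(n) {
  paste0("https://www.presidency.ucsb.edu/documents/inaugural-address-", n)
}

# Hypothetical helper: recover n from the trailing number of such a URL.
inaug_n <- function(url) {
  as.integer(sub(".*-([0-9]+)/?$", "\\1", url))
}

address_url <- inaug_url(15)
address_n <- inaug_n(address_url)
```

Note that `inaug_n()` returns NA (with a warning) for a URL that does not end in a number, so it is only suitable for URLs that follow the pattern above.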

Examples

# Party Platforms
url <- "https://www.presidency.ucsb.edu/documents/2016-democratic-party-platform"
DemPP2016 <- scrapePlat(url, '.field-docs-content')

# Inaugural Address
# Source: https://www.presidency.ucsb.edu/documents/inaugural-address-15
# n = 15
Obama2012 <- scrapeInaug(15, '.field-docs-content')

# State of the Union
url <- "https://www.presidency.ucsb.edu/documents/address-before-joint-session-the-congress-the-state-the-union-19"
ObamaSOTU16 <- scrapeSOTU(url, '.field-docs-content')

# Any Text on UCSB
# https://www.presidency.ucsb.edu/node/321069
# n = 321069
ObamaFarewell <- scrapeUCSB(321069, '.field-docs-content')

lin-jennifer/poltextr documentation built on Dec. 30, 2020, 1:38 p.m.