get_emails: Get emails and their contents

Description

Get the content of Hillary Rodham Clinton's emails by release.

Usage

get_emails(release, save.dir = getwd(), extractor, ...)

Arguments

release

Name of the release batch of emails to fetch; see Details.

save.dir

Directory in which to save the extracted text; defaults to getwd().

extractor

Full path to the pdftotext PDF extractor; see Details.

...

Additional parameters to pass to pdftotext.

Details

Valid values for release follow the WSJ naming convention, for example "August" or "Benghazi" as used in the Examples.

The extractor argument is the full path to your pdftotext.exe extractor. Visit xpdf to download it, or try get_xpdf, which attempts to download and unzip the PDF-to-text extractor. See Examples.
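If pdftotext is already installed, its full path can usually be looked up rather than typed by hand. The following is a minimal sketch using only base R. The helper name find_pdftotext is hypothetical and not part of rodham, and the sketch assumes the binary is on the system PATH.

```r
# Hypothetical helper (not part of rodham): look for a pdftotext binary
# on the PATH and return its full path, or NULL if none is found.
find_pdftotext <- function() {
  path <- unname(Sys.which("pdftotext"))
  if (nzchar(path)) normalizePath(path) else NULL
}

ext <- find_pdftotext()
if (is.null(ext)) {
  message("pdftotext not found on PATH; download it from xpdf or try get_xpdf()")
} else {
  message("Using extractor at: ", ext)
}
```

When a path is found, it can be passed as the extractor argument. Otherwise, fall back to get_xpdf or a manually downloaded binary.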

Value

Fetches the email zip file from the WSJ and extracts the text files into save.dir. Returns the full path to the directory that contains the parsed txt files.
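Because the return value is a directory path, the parsed text still has to be read into R. The sketch below shows one way to do that with base R. The helper read_extracted is hypothetical, and a temporary directory with a dummy file stands in for the directory returned by get_emails so that the sketch runs offline. It assumes the txt files sit directly in that directory; if they are nested, list.files accepts recursive = TRUE.

```r
# Hypothetical helper (not part of rodham): read every .txt file in a
# directory into a list of character vectors, named by file name.
read_extracted <- function(dir) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  texts <- lapply(files, readLines, warn = FALSE)
  names(texts) <- basename(files)
  texts
}

# Stand-in for the directory returned by get_emails()
demo_dir <- file.path(tempdir(), "emails_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("From: a", "Subject: test"), file.path(demo_dir, "doc1.txt"))
writeLines("not an email", file.path(demo_dir, "notes.csv"))

# Only the .txt file is picked up; each element holds that file's lines
emails <- read_extracted(demo_dir)
names(emails)
```

With a real call, replace demo_dir with the value returned by get_emails.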

Author(s)

John Coene jcoenep@gmail.com

See Also

get_xpdf, download_emails, extract_emails

Examples

## Not run: 
# get xpdf extractor
ext <- get_xpdf()

# create a directory for the extracted text
dir.create("emails")

# get emails released in August
emails_aug <- get_emails(release = "August", save.dir = "./emails",
                         extractor = ext)

# use manually downloaded extractor
# ext <- "C:/xpdfbin-win-3.04/bin64/pdftotext.exe"

# get emails related to Benghazi released in December
emails_bengh <- get_emails(release = "Benghazi", extractor = ext,
                           save.dir = "./emails")

## End(Not run)

rodham documentation built on May 1, 2019, 10:21 p.m.