URL_PDF_Text_Pull: PDF scrape text from a list of pdf urls generated from a...
In OJWatson/waities: Scraping package with growing purposes

`URL_PDF_Text_Pull` takes a url and collects the links on that page matched by a minimal css selector. These links point to pdfs, which are downloaded and then text-scraped for 2 words that appear after a provided string.

  URL_PDF_Text_Pull(url = "http://www.who.int/globalchange/resources/country-profiles/en/",
  css = ".a_z a", string = "Population (2013)", pdf.dir.dump,
  downloaded = FALSE, col.names = c("Number", "Size"),
  search.length = 200, words = c(2, 3))

`url`	Url containing links to pdfs
`css`	Minimal css selector for links in url
`string`	String to be matched in the pdf text
`pdf.dir.dump`	Directory path where pdfs are downloaded to
`downloaded`	Boolean determining if `URL_PDF_Text_Pull` has already been called and thus there is no need to redownload pdfs. Default = FALSE
`col.names`	Vector of length 2 giving the column names of the resulting data frame
`search.length`	Integer giving the length of the pdf text to search after the occurrence of `string`
`words`	Integer vector determining which words to store from the search length. N.B. The function will fail if `words` asks for more words than actually appear within `search.length` after the matched string
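
How `string`, `search.length` and `words` interact can be sketched in base R. This is only an illustration under assumptions: `words_after` is a hypothetical helper that is not part of the package, the search length is assumed to be counted in characters, and the sample text is made up rather than taken from a real pdf. The actual implementation of `URL_PDF_Text_Pull` may differ.

```r
# Hypothetical helper, not part of the package: return the words at the
# positions given by `words` from the text that follows `string`.
words_after <- function(text, string, search.length = 200, words = c(2, 3)) {
  # Locate the first literal (non-regex) occurrence of `string`.
  pos <- regexpr(string, text, fixed = TRUE)
  if (pos == -1) {
    stop("`string` was not found in the text")
  }
  # Take the `search.length` characters that follow the match
  # (assumption: the length is measured in characters).
  start <- as.integer(pos) + attr(pos, "match.length")
  window <- substr(text, start, start + search.length - 1)
  # Split the window on whitespace and drop empty tokens.
  tokens <- strsplit(trimws(window), "[[:space:]]+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  # Mirror the documented failure mode: too few words in the window.
  if (max(words) > length(tokens)) {
    stop("`words` asks for more words than the search window contains")
  }
  tokens[words]
}

# Made-up sample text standing in for text extracted from a pdf.
page_text <- "Population (2013) 30.6 million Median age 16.5 years"
words_after(page_text, "Population (2013)", search.length = 20, words = c(1, 2))
# returns c("30.6", "million")
```

In this sketch a short `search.length` can cut the last word of the window in half, and asking for a word position beyond the end of the window raises an error, which is one reason to keep `search.length` generous relative to `words`.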