URL_PDF_Text_Pull
Takes a URL and scrapes the links matched by a minimal CSS selector. These links point to
PDFs, which are then downloaded and text-scraped for two words that appear after a provided string.
URL_PDF_Text_Pull(url = "http://www.who.int/globalchange/resources/country-profiles/en/",
  css = ".a_z a", string = "Population (2013)", pdf.dir.dump,
  downloaded = FALSE, col.names = c("Number", "Size"),
  search.length = 200, words = c(2, 3))
url | URL containing links to PDFs.
css | Minimal CSS selector for the links in url.
string | String to be matched in the PDF text.
pdf.dir.dump | Directory path where the PDFs are downloaded to.
downloaded | Boolean determining if
col.names | Vector of length 2 giving the column names of the resulting data frames.
search.length | Integer giving the length of the PDF text to search after the occurrence of string.
words | Vector of integers determining which words to store from the search length. N.B. the function will fail if an index in words is greater than the actual number of words that appear after the search string within the search length.
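The interaction between string, search.length and words can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name extract_words_after and the sample text are hypothetical, and it assumes that search.length counts characters and that words are separated by whitespace.

```r
# Hypothetical helper (not part of the package): pick the words at the given
# positions from the text that follows the first occurrence of `string`.
extract_words_after <- function(text, string, search.length = 200, words = c(2, 3)) {
  # Locate the first literal (non-regex) occurrence of the search string.
  pos <- regexpr(string, text, fixed = TRUE)
  if (pos == -1) {
    return(rep(NA_character_, length(words)))
  }
  # Keep only `search.length` characters after the end of the match.
  start <- pos + nchar(string)
  window <- substr(text, start, start + search.length - 1)
  # Split the window into whitespace-separated words.
  tokens <- strsplit(trimws(window), "[[:space:]]+")[[1]]
  # Mirror the documented failure mode: asking for a word that is not there.
  if (max(words) > length(tokens)) {
    stop("fewer words than requested after the search string")
  }
  tokens[words]
}

# Made-up sample text standing in for text extracted from a PDF.
txt <- "Population (2013) 29.3 million Population growth"
extract_words_after(txt, "Population (2013)", search.length = 200, words = c(1, 2))
```

With a short search.length the window may contain fewer words than the largest index in words, which is the situation the N.B. above warns about; widening search.length or lowering the indices avoids it.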
A list of data frames of scraped information.