An R wrapper for the quickscrape web scraping tool.
You need node.js
0.8 or later installed to run quickscraper. You can get it
here.
In R, run:
if(!require(devtools)) install.packge("devtools")
library(devtools)
install_github("noamross/quickscraper")
When first loaded, quickscraper
will ask to install its node module dependencies.
These are not bundled with the package because they contain system-specific
binaries which would not be allowed on CRAN, but by using the node package
manager after the R package install, the correct ones can be selected and
installed.
The node-wrapping and dependency-checking components of this package (In
R/node.r
)
will eventually be extracted into a separate package of utilties for wrapping
arbitrary node modules.
Usage is currently very basic, as the quickscrape
API is still evolving.
The function scrape()
scrapes information from a url or a set or URLs,
saving the results to disk or returning them to R in the form of a list.
For instance:
scrape('https://peerj.com/articles/409/')
See ?scrape
for more details and options.
Note that this repository contains both
quickscrape and
journal-scrapers. These
are handled using git subtree
.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.