webhose

Tools to Work with the 'webhose.io' 'API'

Description

The 'webhose.io' 'API' (https://webhose.io/about) provides access to structured web data feeds across vertical content domains. Their crawlers download the web, structure the data, then index and save it into domain-specific repositories that can be accessed on demand. Methods are provided to query and retrieve content from this 'API'.

TODO

Cover the rest of the webhose.io API.

Covered are

What's in the tin?

The following functions are implemented:

Installation

devtools::install_github("hrbrmstr/webhose")
options(width=120)

Usage

library(webhose)

# current version
packageVersion("webhose")

Make just one call and/or handle API pagination on your own:

res <- filter_posts("(China AND United) language:english site_type:news site:bloomberg.com", ts = 1213456)

str(res)
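
If you handle pagination yourself, one way to limit credit use is to cap the number of calls. The following is a minimal sketch of that loop, not the package's implementation. It runs offline against a stand-in function: `fake_fetch_page()`, `collect_pages()`, and the `posts`, `more`, and `next_page` fields are all hypothetical names chosen for illustration and do not describe what `filter_posts()` actually returns.

```r
# Hypothetical stand-in for a single-page API call (NOT part of webhose).
# Each "page" carries some posts, a flag for remaining results, and the
# index of the next page to request.
fake_fetch_page <- function(page) {
  pages <- list(
    list(posts = data.frame(title = c("a", "b"), stringsAsFactors = FALSE),
         more = 1, next_page = 2),
    list(posts = data.frame(title = "c", stringsAsFactors = FALSE),
         more = 0, next_page = NA)
  )
  pages[[page]]
}

# Request pages one at a time, stopping when nothing is left or when
# max_pages calls have been made (each call would cost API credits).
collect_pages <- function(fetch_page, max_pages = 5) {
  out <- list()
  page <- 1
  for (i in seq_len(max_pages)) {
    res <- fetch_page(page)
    out[[length(out) + 1]] <- res$posts
    if (res$more == 0) break
    page <- res$next_page
  }
  do.call(rbind, out)
}

all_posts <- collect_pages(fake_fetch_page, max_pages = 5)
nrow(all_posts)
```

With a real client you would replace the stand-in with a function that makes one API call per invocation and reads whatever continuation fields the response actually provides.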

Auto-handle pagination (NOTE: you're more likely to rip through your plan's API credits this way):

res <- fetch_posts("(China AND United) language:english site_type:news site:bloomberg.com", ts = 1213456)

dplyr::glimpse(res)


hrbrmstr/webhose documentation built on May 30, 2019, 6:56 p.m.