get_news: Retrieve news articles


View source: R/get_news.R

Description

get_news returns news articles from the Newsriver API matching a user-provided search query.

Usage

get_news(query, from = NULL, to = NULL, language = "en",
  limit = 100, api_token = NULL, ua = NULL)

Arguments

query

Character string, specifying the query to be searched when calling the Newsriver API. Many fields of retrieved articles can be searched, but query should only be used to search the title and text fields (other fields are handled by default or specified as separate parameters to get_news). Search queries must be valid Lucene query strings.

To build valid search queries, search terms can be passed into the title and text fields using a colon. For example, to search for any articles containing the word "Google" in the text, use query = "text:Google", or to search for any articles with "Twitter" in the title use query = "title:Twitter". Multiple search terms/fields can be placed together separated by "OR", "AND", and "NOT" operators (which perform as expected) to build more complex queries. To group multiple search terms in one field, use parentheses. For example, to search for any articles that contain "Google" in the title, and "Cloud" or "BigQuery" in the text, use query = "title:Google AND text:(Cloud OR BigQuery)".
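As a minimal sketch of the query-building rules above, a grouped query can be assembled from character vectors with base R string functions. The helper field_clause is hypothetical and is not part of newsrivr; it only illustrates the field:term and field:(a OR b) patterns.

```r
# Hypothetical helper (not part of newsrivr): build a Lucene clause for one
# field, wrapping multiple terms in parentheses, e.g. text:(Cloud OR BigQuery).
field_clause <- function(field, terms, op = "OR") {
  grouped <- paste(terms, collapse = paste0(" ", op, " "))
  if (length(terms) > 1) grouped <- paste0("(", grouped, ")")
  paste0(field, ":", grouped)
}

# Combine a title clause and a grouped text clause with AND.
query <- paste(
  field_clause("title", "Google"),
  field_clause("text", c("Cloud", "BigQuery")),
  sep = " AND "
)
query
```

The resulting string can then be passed as the query argument of get_news.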

To search exact phrases, use double quotes. To do this, either wrap the whole query in single quotes and use double quotes inside it, e.g., query = 'title:"RStudio Connect"', or escape each internal double quote with a single backslash, e.g., query = "\"RStudio Connect\"". Note: (i) search queries are case sensitive, (ii) spaces behave like OR operators, (iii) encoded queries cannot exceed 414 characters. For more examples and information on building queries, see the official Newsriver Code Book.
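The two quoting styles produce the same R string, which can be checked directly. The length check below is only a rough sketch: it assumes that percent-encoding with utils::URLencode approximates the encoding get_news applies, which this page does not confirm.

```r
# Both quoting styles yield the same character string.
single_quoted <- 'title:"RStudio Connect"'
escaped <- "title:\"RStudio Connect\""
same_string <- identical(single_quoted, escaped)

# Assumption: utils::URLencode is used here only to estimate the encoded
# length; the encoding performed inside get_news may differ.
encoded <- utils::URLencode(escaped, reserved = TRUE)
within_limit <- nchar(encoded) <= 414

same_string
within_limit
```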

from, to

Character strings, specifying the date range of your search. Each must be in the "%Y-%m-%d" format. to defaults to the current date and from defaults to one month prior to that (i.e., the past month). Note: Newsriver can only retrieve articles from the past year.
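A minimal sketch of the date handling described above, using base R only. A fixed to date is used so the result is reproducible (get_news itself defaults to the current date), and is_valid_date is a hypothetical helper, not a newsrivr function.

```r
# Fixed end date for reproducibility; the real default is the current date.
to <- as.Date("2019-05-01", format = "%Y-%m-%d")

# One calendar month before `to`, mirroring the documented default for `from`.
from <- seq(to, length.out = 2, by = "-1 month")[2]
from_string <- format(from, "%Y-%m-%d")

# Hypothetical check: as.Date returns NA when the string does not match
# the "%Y-%m-%d" format.
is_valid_date <- function(x) !is.na(as.Date(x, format = "%Y-%m-%d"))

from_string
is_valid_date("2019-05-01")
is_valid_date("01/05/2019")
```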

language

Character string, specifying the language of the articles to return. Must be in the ISO 639-1 two-letter code format (e.g., "en", "it", "es", etc.).

limit

Integer, specifying the maximum number of results to return per day between the supplied from and to dates. Accepts values from 1 to 100 (e.g., with limit = 100, a search period of 10 days can return a maximum of 1000 articles).

api_token, ua

Character strings, specifying a Newsriver API token and a user agent, respectively. Both default to the values set using store_creds.

Details

Search queries

get_news calls the Newsriver API by generating custom HTTP GET requests. These requests are composed of multiple query parameters (see the Newsriver API reference manual). Although the Newsriver query parameter can search many fields, the query parameter of get_news should only be used to search the title and text fields of news articles, because other fields are handled by default or passed as separate arguments to get_news (e.g., language).

Date sequences

Results from the Newsriver API are limited to a maximum of 100 articles per GET request. In order to return the maximum number of results, get_news creates a sequence of search dates, by day, specified between the from and to parameters. Each search date from the sequence is then combined with the other query parameters to create a unique GET request for that date. The results from each GET request are then combined and returned.
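The per-day strategy can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: fetch_day is a hypothetical stand-in for a single GET request and returns a placeholder row instead of calling the API.

```r
from <- as.Date("2018-12-01")
to <- as.Date("2018-12-10")
limit <- 100

# One search date per day between from and to (inclusive).
search_dates <- seq(from, to, by = "day")

# Upper bound on returned articles: one request per day, `limit` per request.
max_results <- length(search_dates) * limit

# Hypothetical stand-in for one request; returns a one-row placeholder.
fetch_day <- function(date) {
  data.frame(date = format(date, "%Y-%m-%d"), stringsAsFactors = FALSE)
}

# Run one "request" per date, then bind the per-day results together.
per_day <- lapply(seq_along(search_dates), function(i) fetch_day(search_dates[i]))
combined <- do.call(rbind, per_day)

max_results
nrow(combined)
```

Splitting by day trades more requests for a higher overall ceiling, since each request is capped independently.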

Rate limiting

Rate limiting is handled automatically by get_news.

Examples

## Not run: 
get_news("Google")

get_news("title:Google", language = "es", limit = 50)

get_news("title:\"Google Cloud\"", from = "2018-12-01", to = "2019-05-01")

get_news("title:Google AND text:\"Google Cloud\"")

## End(Not run)

MikeJohnPage/newsrivr documentation built on Jan. 4, 2021, 7:48 p.m.