newscatcheR
In newscatcheR: Programmatically Collect Normalized News from (Almost) Any Website

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(newscatcheR)
library(tidyRSS)

Overview

The package provides four simple functions for finding RSS feeds from news outlets and reading them conveniently into a tibble.

The newscatcheR package provides a dataset of news sites and their RSS feeds, together with some characteristics of each website such as its topic, country, or language, and a few functions to explore and access the feeds from R.

Two functions that work as wrappers around tidyRSS fetch the feed of a given website. Two additional functions make it convenient to browse the websites dataset.

get_news()

The first function, get_news(), returns a tibble of the RSS feed of a given site.

# adding a small time delay to avoid simultaneous posts to the API
Sys.sleep(3)
get_news(website = "ycombinator.com", rss_table = package_rss)

get_headlines()

The second function, get_headlines(), is a helper that returns a tibble of just the headlines instead of the full RSS feed.

# adding a small time delay to avoid simultaneous posts to the API
Sys.sleep(3)
get_headlines(website = "ycombinator.com", rss_table = package_rss)

describe_url()

Because some websites have multiple feeds divided by topic, describe_url(website) can help you see which topics a given website offers.

describe_url("bbc.com")

filter_urls()

Finally, filter_urls(topic, country, language) can be used to browse the dataset by topic, country, or language.

filter_urls(topic = "tech", country = "IT", language = "it")
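Conceptually, browsing the dataset this way amounts to keeping the rows of a table that match each criterion. The sketch below illustrates that idea on a toy data frame. The column names, the rows, and the helper filter_toy() are all invented for illustration and are not the package's actual schema or implementation; treating an omitted argument as "match anything" is likewise an assumption.

```r
# Toy stand-in for the package dataset; columns and rows are hypothetical.
toy_feeds <- data.frame(
  url      = c("a.example", "b.example", "c.example"),
  topic    = c("tech", "news", "tech"),
  country  = c("IT", "IT", "DE"),
  language = c("it", "it", "de"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows that match every criterion supplied.
# A NULL argument means "do not filter on this column".
filter_toy <- function(data, topic = NULL, country = NULL, language = NULL) {
  keep <- rep(TRUE, nrow(data))
  if (!is.null(topic))    keep <- keep & data$topic == topic
  if (!is.null(country))  keep <- keep & data$country == country
  if (!is.null(language)) keep <- keep & data$language == language
  data[keep, , drop = FALSE]
}

italian_tech <- filter_toy(toy_feeds, topic = "tech", country = "IT", language = "it")
italian_tech
```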

Use case

This package can be convenient if you need to fetch news from various websites for further analysis and you don't want to search manually for the URL of their RSS feeds.

Assuming we have the news sites we want to follow:

sites = c("bbc.com", "spiegel.de", "washingtonpost.com")

We can get a list of data frames with:

lapply(sites, get_news)
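When looping over several sites, it can help to name the results and to keep one failing feed from stopping the whole run. The following is a minimal sketch of that pattern, not something the package provides: fetch_all() is a hypothetical helper, and it assumes that the fetch function signals an R error when a feed cannot be read.

```r
# Hypothetical helper (not part of newscatcheR): apply a fetch function to
# each site, name the results by site, and skip sites whose fetch fails.
fetch_all <- function(sites, fetch, delay = 0) {
  results <- lapply(sites, function(site) {
    Sys.sleep(delay)  # optional pause between requests
    tryCatch(
      fetch(site),
      error = function(e) {
        message("Skipping ", site, ": ", conditionMessage(e))
        NULL  # mark the failed site so it can be dropped below
      }
    )
  })
  names(results) <- sites
  # Drop the sites that failed, keeping the names of the rest
  Filter(Negate(is.null), results)
}
```

With the package loaded, this could be called as fetch_all(sites, fetch = get_news, delay = 3), after which a single feed is available by name, for example feeds[["bbc.com"]] if the result is stored in feeds.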

newscatcheR documentation built on Sept. 20, 2023, 5:07 p.m.
