Based on a slightly amended version of the regular expression used to classify news and non-news in “Exposure to ideologically diverse news and opinion on Facebook” by Bakshy, Messing, and Adamic, Science, 2015.
vector of URLs
Amendment: the soft news pattern uses "sport" rather than "sports".
A URL containing any of the following strings is classified as soft news: "sport|entertainment|arts|fashion|style|lifestyle|leisure|celeb|movie|music|gossip|food|travel|horoscope|weather|gadget"
A URL containing any of the following strings is classified as hard news: "politi|usnews|world|national|state|elect|vote|govern|campaign|war|polic|econ|unemploy|racis|energy|abortion|educa|healthcare|immigration"
Note that the classification is based on URL patterns found in a small set of domains. See the paper for details.
data.frame with 3 columns: url, not_news, news
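The matching rule described above can be sketched in base R as follows. This is a minimal illustration, not the package's own implementation: classify_urls is a hypothetical name, the example URLs are invented, and both case-insensitive matching and logical (TRUE/FALSE) output columns are assumptions. Only the two patterns and the column names url, not_news, and news come from the description.

```r
# Patterns copied verbatim from the description above.
soft_pattern <- "sport|entertainment|arts|fashion|style|lifestyle|leisure|celeb|movie|music|gossip|food|travel|horoscope|weather|gadget"
hard_pattern <- "politi|usnews|world|national|state|elect|vote|govern|campaign|war|polic|econ|unemploy|racis|energy|abortion|educa|healthcare|immigration"

# classify_urls() is a hypothetical helper written for illustration only.
# Assumptions: matching ignores case, and the flags are returned as logicals.
classify_urls <- function(urls) {
  data.frame(
    url = urls,
    not_news = grepl(soft_pattern, urls, ignore.case = TRUE),  # soft news match
    news = grepl(hard_pattern, urls, ignore.case = TRUE),      # hard news match
    stringsAsFactors = FALSE
  )
}

# Invented example URLs.
urls <- c(
  "https://example.com/sport/football-results",
  "https://example.com/politics/senate-vote",
  "https://example.com/about"
)
res <- classify_urls(urls)
print(res)
```

Because the patterns match substrings anywhere in the URL, a single URL can be flagged in both columns or in neither, and short strings such as "war" also match inside unrelated words such as "software".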