README.md
In unimi-dse/1ed5ff0d: BBC News article text analysis

bbcnews Package

1ed5ff0d

The bbcnews package scrapes article text from the BBC News website, then cleans, analyzes, and plots it.

# first install the R package "devtools" if not installed
devtools::install_github('unimi-dse/1ed5ff0d')

# load the package
require(bbcnews)

Function | Description | Examples
---------------------|-----------------------------------------------------------------------------|------------------------
get_text() | Scrape BBC News article headline and body text | get_text("world-us-canada-51381625")
clean_text() | Clean and transform article text (stem text, remove numbers, stopwords etc.) | clean_text("world-us-canada-51381625")
analyze_text() | Build a dataframe with the most frequent words and plot it with a wordcloud | analyze_text("world-us-canada-51381625", 50, TRUE)
assoc_word() | Find associations (terms which are correlated) with the input word | assoc_word("world-us-canada-51408704", "trump", 0.5)
analyze_sentiment() | Analyze sentiment distribution across the article and plot it | analyze_sentiment("world-us-canada-51381625")

Type ?function_name for a complete description of a function (for example, ?get_text).

The url_end argument is used in all the functions. It is a character string: the final part of a BBC News article URL (everything after https://www.bbc.com/news/). For example, if the article URL is "https://www.bbc.com/news/world-us-canada-51381625", pass only "world-us-canada-51381625" to the function.
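If you have a full article URL copied from the browser, the url_end value can be derived by stripping the fixed prefix. The sketch below uses base R only; the helper name to_url_end is hypothetical and is not part of the bbcnews package.

```r
# Hypothetical helper (not part of bbcnews): derive url_end from a full article URL.
bbc_prefix <- "https://www.bbc.com/news/"

to_url_end <- function(article_url) {
  # fixed = TRUE treats the prefix as a literal string, not a regular expression
  sub(bbc_prefix, "", article_url, fixed = TRUE)
}

url_end <- to_url_end("https://www.bbc.com/news/world-us-canada-51381625")
print(url_end)
```

The resulting string can then be passed to any of the package functions, for example get_text(url_end). A value that already lacks the prefix is returned unchanged.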

Note: Check that the URL (url_end) exists before running a function; otherwise you will get the error "Error in open.connection(x, "rb") : HTTP error 404". Use only URLs of English-language BBC News articles (not Sports, Travel etc.).
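When processing several articles, one way to avoid stopping on a missing URL is to wrap each call in tryCatch. This is a minimal sketch under the assumption that the package functions signal an ordinary R error on HTTP 404, as the message above suggests; the wrapper name try_article is hypothetical and not part of bbcnews.

```r
# Hypothetical wrapper (not part of bbcnews): call a function on url_end and
# return NULL with a message instead of stopping when an error occurs.
try_article <- function(fun, url_end, ...) {
  tryCatch(
    fun(url_end, ...),
    error = function(e) {
      message("Could not process '", url_end, "': ", conditionMessage(e))
      NULL
    }
  )
}

# With the package loaded, usage might look like:
# article <- try_article(get_text, "world-us-canada-51381625")
```

A NULL result then indicates that the article could not be retrieved, so the caller can skip it and continue.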

Dependencies: rvest, xml2, tm, SnowballC, wordcloud, sentimentr, ggplot2, RColorBrewer, utils, devtools, dplyr, magrittr

Author: Angelina Khatiwada

Date: February 2020

unimi-dse/1ed5ff0d documentation built on Feb. 10, 2020, 12:21 a.m.
