analyze_text: Analyze BBC News article


View source: R/AnalyzeText.R

Description

Builds a data frame of the num_word most frequent words in the article, sorted by decreasing frequency. Optionally renders the result as a word cloud.

Usage

analyze_text(url_end, num_word, do_plot = FALSE)

Arguments

url_end

character string, the ending part of a particular BBC News article URL (everything after https://www.bbc.com/news/). For example, if the article URL is "https://www.bbc.com/news/world-us-canada-51381625", only "world-us-canada-51381625" should be passed

num_word

numeric, the number of words to include in the data frame and the plot. For plotting, 50 or 100 is recommended (no more than 200): with larger values the plot becomes distorted and the additional low-frequency words carry little information

do_plot

logical, if TRUE, a word cloud is rendered. The default is do_plot = FALSE, so no plot is rendered
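
To illustrate how url_end relates to the full article address, here is a minimal sketch. The helper build_bbc_url is hypothetical and not part of the package; it only assumes the base URL stated above and shows which part of the address belongs in url_end.

```r
# Hypothetical helper (not part of the package): rebuilds the full
# article URL from the url_end argument described above.
build_bbc_url <- function(url_end) {
  stopifnot(is.character(url_end), length(url_end) == 1)
  # If the full URL was pasted by mistake, keep only the ending part.
  url_end <- sub("^https?://www\\.bbc\\.com/news/", "", url_end)
  paste0("https://www.bbc.com/news/", url_end)
}

print(build_bbc_url("world-us-canada-51381625"))
```

Passing either the ending part or the complete URL to this helper yields the same full address, which can be opened in a browser to confirm that the article exists.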

Value

h_df_word_freq, a data frame with the num_word most frequent words in the article (two columns: the words and their frequency of appearance in the text). If do_plot = TRUE, a plot of the data frame is also rendered
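
The shape of the returned data frame can be sketched in base R as follows. This is not the package's implementation: the function name top_words, the column names word and freq, and the simple tokenization rule are all assumptions made for illustration, and the sketch works on a plain string rather than a downloaded article.

```r
# Illustrative sketch only: counts words in a string and returns the
# num_word most frequent ones, most frequent first.
top_words <- function(text, num_word) {
  # Split on anything that is not a letter or apostrophe, then lowercase.
  words <- tolower(unlist(strsplit(text, "[^A-Za-z']+")))
  words <- words[nzchar(words)]
  # Count occurrences and sort by decreasing frequency.
  freq <- sort(table(words), decreasing = TRUE)
  freq <- freq[seq_len(min(num_word, length(freq)))]
  data.frame(
    word = names(freq),
    freq = as.integer(freq),
    stringsAsFactors = FALSE
  )
}

print(top_words("the cat and the dog and the bird", 2))
```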

Note

Check that the URL (url_end) exists before running the function; otherwise you will get 'Error in open.connection(x, "rb") : HTTP error 404'. Use URLs of English-language articles only. The function works only for BBC News, not BBC Sport, Travel, Worklife, etc.
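
When several articles are processed in a script, one way to keep a missing URL from stopping the run is to catch the error. The wrapper below is a hypothetical sketch, not part of the package; its analyze parameter is an assumption added so that the wrapped function can be swapped out, and it defaults to analyze_text.

```r
# Hypothetical wrapper (not part of the package): returns NULL and
# prints a message instead of stopping when the call fails,
# for example on an HTTP 404 error.
try_analyze <- function(url_end, num_word, do_plot = FALSE,
                        analyze = analyze_text) {
  tryCatch(
    analyze(url_end, num_word, do_plot),
    error = function(e) {
      message("Could not analyze '", url_end, "': ", conditionMessage(e))
      NULL
    }
  )
}

# Usage, assuming the package is loaded:
# result <- try_analyze("world-us-canada-51381625", 50)
# if (!is.null(result)) print(head(result))
```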

Examples

analyze_text("world-us-canada-51381625", 100, FALSE)
analyze_text("entertainment-arts-51398105", 50, TRUE)
analyze_text("world-us-canada-51408704", 200)

unimi-dse/1ed5ff0d documentation built on Feb. 10, 2020, 12:21 a.m.