Cleans the documents in a Corpus that holds the text of a BBC News article.
clean_text(url_end)
url_end
character string, the final part of a BBC News article URL (everything after https://www.bbc.com/news/). For example, if the article URL is "https://www.bbc.com/news/world-us-canada-51381625", only "world-us-canada-51381625" should be passed.
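If you start from a full article URL, the url_end value can be derived by stripping the fixed prefix. The helper below is a small sketch for that purpose; `extract_url_end` is a hypothetical name and is not part of the package.

```r
# Hypothetical helper (not part of the package): strip the BBC News prefix
# from a full article URL to obtain the value expected as url_end.
extract_url_end <- function(article_url) {
  sub("^https://www\\.bbc\\.com/news/", "", article_url)
}

url_end <- extract_url_end("https://www.bbc.com/news/world-us-canada-51381625")
print(url_end)
```

A URL that does not start with the prefix is returned unchanged, so it is worth checking the result before passing it on.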
Replaces symbols such as "/" and "\" with spaces, stems the text, removes numbers, English stopwords, extra spaces and punctuation, and converts the text to lower case.
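The transformations listed above can be sketched with the tm package. This is an illustration only: the source does not show the function's implementation, so the use of tm and SnowballC, the order of the steps, and the sample paragraphs are all assumptions.

```r
library(tm)         # Corpus, tm_map and the text transformations
library(SnowballC)  # stemmer backend used by stemDocument

# Hypothetical sample paragraphs standing in for downloaded article text
paragraphs <- c(
  "The 3 senators were voting on/against the measures in 2020.",
  "Markets reacted   quickly; shares\\bonds both moved."
)

# One document per paragraph
art <- VCorpus(VectorSource(paragraphs))

# Transformer that replaces a regex pattern with a space
to_space <- content_transformer(function(x, pattern) gsub(pattern, " ", x))

# Assumed order: lower-casing comes before stopword removal so that
# capitalised stopwords such as "The" are matched
art_c <- tm_map(art, to_space, "/")
art_c <- tm_map(art_c, to_space, "\\\\")  # regex for a literal backslash
art_c <- tm_map(art_c, content_transformer(tolower))
art_c <- tm_map(art_c, removeNumbers)
art_c <- tm_map(art_c, removeWords, stopwords("english"))
art_c <- tm_map(art_c, removePunctuation)
art_c <- tm_map(art_c, stemDocument)
art_c <- tm_map(art_c, stripWhitespace)

# Collect the cleaned text of each document for inspection
cleaned <- unname(vapply(
  art_c,
  function(d) paste(as.character(d), collapse = " "),
  character(1)
))
print(cleaned)
```

Stripping whitespace last is a deliberate choice in this sketch, because the earlier removals leave gaps behind.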
art_c - a Corpus of cleaned and transformed text documents. The Corpus is a collection of text documents holding the article text, with each article paragraph stored as a single document.
Please check that the URL (url_end) exists before running the function; otherwise you will get the error "Error in open.connection(x, "rb") : HTTP error 404". Please supply URLs of articles in English only. The function works only for BBC News, not for BBC Sports, Travel, Worklife, etc.
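When processing several articles, one option is to catch the error described above so that a single missing page does not stop the run. The wrapper below is a sketch; `try_clean` and `fake_clean_text` are hypothetical names, and the stand-in function exists only so the example runs without network access.

```r
# Hypothetical wrapper: returns NULL instead of stopping when `fun` fails
try_clean <- function(url_end, fun) {
  tryCatch(
    fun(url_end),
    error = function(e) {
      message("Skipping '", url_end, "': ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for clean_text that always fails, imitating a missing page
fake_clean_text <- function(url_end) stop("HTTP error 404")

result <- try_clean("no-such-article-00000000", fake_clean_text)
print(is.null(result))
```

With the package loaded, the same wrapper could be called as try_clean("world-us-canada-51381625", clean_text).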
clean_text("world-us-canada-51381625")
clean_text("entertainment-arts-35232060")