Compare articles with a set of documents, e.g. press releases.
whe_similarity(wh, docs, progress = interactive())
## S3 method for class 'data.frame'
whe_similarity(wh, docs, progress = interactive())
## S3 method for class 'character'
whe_similarity(wh, docs, progress = interactive())
wh: highlighted object returned by wh_collect, see examples.
docs: documents to compare the articles with.
progress: whether to show a progress bar.
This function uses the Jaccard index (https://en.wikipedia.org/wiki/Jaccard_index).
If a data.frame is passed, the function appends a column named similarity.*, where * is the input document number.
If a character vector is passed, the function returns a character vector.
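The Jaccard index used by this function is the size of the intersection of two sets divided by the size of their union. The sketch below illustrates the idea on word sets. It is not the package's implementation: jaccard_similarity is a hypothetical helper, and the tokenisation (lower-casing, splitting on non-alphanumeric characters) is an assumption made for the example.

```r
# Hypothetical helper, not part of webhose: Jaccard index on sets of words.
jaccard_similarity <- function(a, b) {
  # Assumed tokenisation: lower-case, split on runs of non-alphanumeric
  # characters, drop empty strings, keep unique tokens.
  tokenize <- function(x) {
    tokens <- unlist(strsplit(tolower(x), "[^a-z0-9]+"))
    unique(tokens[nzchar(tokens)])
  }
  set_a <- tokenize(a)
  set_b <- tokenize(b)
  union_size <- length(union(set_a, set_b))
  # The index is undefined when both inputs contain no tokens.
  if (union_size == 0) return(NA_real_)
  length(intersect(set_a, set_b)) / union_size
}

# 3 shared words out of 6 distinct words gives 0.5
jaccard_similarity("the gender gap is widening", "the gender gap report")
```

A value of 1 means the two texts use exactly the same set of words, and 0 means they share none.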
## Not run:
library(webhose)

token <- wh_token("xXX-x0X0xX0X-00X")

token %>%
  wh_news(q = '"World Economic Forum"') %>% # use highlight!
  wh_collect() -> wef # collect results

library(rvest)

# scrape Gender Gap Report press release
html <- read_html('http://reports.weforum.org/global-gender-gap-report-2017/press-release/')

html %>%
  html_nodes(".content") %>%
  html_children() %>%
  html_text() %>%
  .[5:40] %>%
  paste0(., collapse = "\n") -> pr

# compare the collected articles with the press release
wef %>%
  whe_similarity(pr) -> similarity

library(dplyr)

# stored under a separate name so the similarity result is not overwritten
wef %>%
  mutate(nmentions = whe_mentions(text)) -> mentions
## End(Not run)