ECB_press_conferences_tokens | R Documentation |
The pre-processed and tokenized version of the ECB_press_conferences corpus of press conferences. The processing involved the following steps:
Subset paragraphs shorter than 10 words
Removal of stop words
Part-of-speech tagging, after which only nouns, proper nouns and adjectives were retained
Detection and merging of frequent compound words
Frequency-based cleaning of rare and very common words
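The steps above could be approximated with quanteda along the following lines. This is only a minimal sketch on a toy corpus, not the script used to build the dataset: the example texts, the reading of the first step as dropping paragraphs shorter than 10 words, the compound "interest rates" and the document-frequency threshold are all assumptions made for illustration. The part-of-speech step is left out because it needs an external tagger.

```r
library(quanteda)

# Toy paragraphs (hypothetical, not taken from the ECB corpus)
txt <- c(
  d1 = "The Governing Council decided to keep the key interest rates unchanged because inflation remains low.",
  d2 = "Thank you.",
  d3 = "Inflation is expected to remain below two percent over the medium term according to projections."
)

toks <- tokens(corpus(txt), remove_punct = TRUE)
toks <- tokens_tolower(toks)

# Step 1 (assumed reading): keep only paragraphs with at least 10 words
long_enough <- unname(ntoken(toks) >= 10)
toks <- tokens_subset(toks, long_enough)

# Step 2: remove English stop words
toks <- tokens_remove(toks, stopwords("en"))

# Step 3 (part-of-speech filtering) is omitted in this sketch

# Step 4: merge a compound; the phrase is hand-picked here, whereas the
# original processing detected frequent compounds automatically
toks <- tokens_compound(toks, phrase("interest rates"))

# Step 5: frequency-based cleaning; as an arbitrary demo threshold, drop
# terms that occur in more than one document
trimmed <- dfm_trim(dfm(toks), max_docfreq = 1)
toks <- tokens_keep(toks, featnames(trimmed), valuetype = "fixed")

print(toks)
```

On real data the thresholds in the last step would be chosen from the corpus statistics rather than fixed in advance.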
ECB_press_conferences_tokens
A quanteda::tokens object.
Source: https://www.ecb.europa.eu/press/key/date/html/index.en.html
See also: ECB_press_conferences
LDA(ECB_press_conferences_tokens)