Tools for analysing temporally structured text collections, including tools for
reading in large sets of texts (via pdftools), and for time series analysis of
qualitative statistics such as word associations and topic models (primarily
via quanteda and topicmodels).
System requirements (for linux):
For other systems, see the respective documentation for pdftools and
topicmodels.
devtools::install_github ('mpadge/texttimetravel')
Load packages and a temporally-structured corpus to work with:
library (texttimetravel)
library (quanteda)
dat <- data_corpus_inaugural
#dat <- corpus_reshape (dat, to = "sentences") # if desired
(data_corpus_inaugural is a sample corpus from quanteda of inaugural speeches
of US presidents.) Then use quanteda functions to convert to the desired
tokenized form:
tok <- tokens (dat, remove_numbers = TRUE, remove_punct = TRUE, remove_separators = TRUE)
tok <- tokens_remove (tok, stopwords ("english"))
Keyword associations can be extracted with the ttt_keyness function, which
relies on the quanteda::textstat_keyness function but simplifies the interface
so that keyness statistics can be extracted with a single function call.
library (magrittr) # provides the %>% pipe used below
x <- ttt_keyness (tok, "politic*")
head (x, n = 10) %>% knitr::kable ()
x <- ttt_keyness (tok, "school*")
head (x, n = 10) %>% knitr::kable ()
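For readers curious what such a call might involve, the following is a minimal sketch of a keyword-association computation using quanteda functions directly. It is not taken from the texttimetravel source and may differ from what ttt_keyness actually does. The window of 10 tokens, the use of the quanteda.textstats package (where textstat_keyness lives in quanteda version 3 and later), and all variable names are assumptions made for illustration.

```r
library (quanteda)
library (quanteda.textstats) # assumption: quanteda >= 3, where textstat_keyness lives here

tok <- tokens (data_corpus_inaugural, remove_numbers = TRUE, remove_punct = TRUE)
tok <- tokens_remove (tok, stopwords ("english"))

# Tokens within 10 words of the keyword form the target; the window size is
# an illustrative assumption.
inside <- tokens_keep (tok, pattern = "politic*", window = 10)
inside <- tokens_remove (inside, pattern = "politic*") # drop the keyword itself
# Everything outside those windows forms the reference.
outside <- tokens_remove (tok, pattern = "politic*", window = 10)

dfm_in <- dfm (inside)
dfm_out <- dfm (outside)

# Stack the two document-feature matrices and mark the first block as target.
key <- textstat_keyness (rbind (dfm_in, dfm_out),
                         target = seq_len (ndoc (dfm_in)))
head (key, n = 10)
```

Features with high positive keyness values in the result occur near the keyword more often than would be expected from the rest of the corpus.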
The function ttt_fit_topics provides a convenient wrapper around the functions
provided by the topicmodels package, and extends functionality via two
additional parameters: years, allowing topic models to be fitted only to those
portions of a corpus corresponding to the specified years; and topic, allowing
models to be fitted around a specified topic phrase.
x <- ttt_fit_topics (tok, ntopics = 5)
topicmodels::get_terms (x, 10) %>% knitr::kable ()
x <- ttt_fit_topics (tok, years = 1789:1900, ntopics = 5)
topicmodels::get_terms (x, 10) %>% knitr::kable ()
x <- ttt_fit_topics (tok, topic = "nation", ntopics = 5)
topicmodels::get_terms (x, 10) %>% knitr::kable ()
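As a rough illustration of what restricting a topic model to particular years could look like without the wrapper, the sketch below subsets the corpus by its Year document variable and fits a model with topicmodels::LDA. This is an assumption about one plausible approach, not the implementation of ttt_fit_topics. The dfm_trim threshold, the seed, and the variable names are all chosen purely for illustration.

```r
library (quanteda)
library (topicmodels)

# Keep only speeches from the chosen years, using the corpus "Year" docvar.
dat <- corpus_subset (data_corpus_inaugural, Year %in% 1789:1900)

tok <- tokens (dat, remove_numbers = TRUE, remove_punct = TRUE)
tok <- tokens_remove (tok, stopwords ("english"))

# Build a document-feature matrix and drop rare terms (threshold is an
# illustrative assumption).
d <- dfm (tok)
d <- dfm_trim (d, min_termfreq = 5)

# Convert to the format expected by topicmodels and fit a 5-topic model.
dtm <- convert (d, to = "topicmodels")
fit <- LDA (dtm, k = 5, control = list (seed = 1))

terms_by_topic <- get_terms (fit, 10) # 10 top terms for each of the 5 topics
terms_by_topic
```

Fixing the seed makes repeated runs comparable, which matters when comparing models fitted to different periods.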