
knitr::opts_chunk$set(eval = TRUE, echo = TRUE)

Preparing documents and covariates

Please read Preparation for how to read documents and create a list of keywords. This example uses the bills data we prepared (documents and keywords).

keyATM accepts covariate data as a matrix or a data.frame (including a tibble). If you have $D$ documents and $M$ covariates, the matrix should be $D \times M$. In this example, the single covariate is a dummy variable that indicates party identification.

library(keyATM)
data(keyATM_data_bills)
bills_cov <- keyATM_data_bills$cov
dim(bills_cov)  # We have 140 documents and a single covariate
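
To make the $D \times M$ layout concrete, here is a minimal sketch with made-up data. The object names (`toy_cov`, `toy_cov_mat`) and the `Senate` column are hypothetical and are not part of the bills data.

```r
# Hypothetical covariates for D = 4 documents and M = 2 covariates
# (the values and the Senate column are invented for illustration)
toy_cov <- data.frame(
  RepParty = c(1, 0, 0, 1),  # dummy variable for party identification
  Senate   = c(0, 0, 1, 1)   # a second, made-up dummy variable
)

# A matrix with the same layout: one row per document, one column per covariate
toy_cov_mat <- as.matrix(toy_cov)

D <- 4  # number of documents in this toy example
stopifnot(nrow(toy_cov_mat) == D)
dim(toy_cov_mat)
```

Either `toy_cov` or `toy_cov_mat` has the shape described above: rows index documents and columns index covariates.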

Please make sure that the rows of the covariate data are in the same order as the documents.
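
If the covariates are stored with a document identifier, one way to enforce the ordering is to reorder the rows with `match()`. The sketch below uses base R only. The identifiers and object names (`doc_ids`, `toy_meta`) are hypothetical, and the bills data may not carry such an id column.

```r
# Hypothetical document ids, in the order the texts are read
doc_ids <- c("bill_03", "bill_01", "bill_02")

# Hypothetical covariates stored in a different order
toy_meta <- data.frame(
  id       = c("bill_01", "bill_02", "bill_03"),
  RepParty = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Position of each document's row in the covariate data
idx <- match(doc_ids, toy_meta$id)
stopifnot(!anyNA(idx))  # every document must have a covariate row

# Reorder so that row i describes document i
toy_meta_aligned <- toy_meta[idx, , drop = FALSE]
stopifnot(identical(toy_meta_aligned$id, doc_ids))
```

After this step the id column can be dropped, leaving only the covariates in document order.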

library(quanteda)
bills_dfm <- keyATM_data_bills$doc_dfm  # quanteda object
keyATM_docs <- keyATM_read(bills_dfm)

bills_keywords <- list(
                       Education = c("education", "child", "student"),
                       Law       = c("court", "law", "attorney"),
                       Health    = c("public", "health", "program"),
                       Drug      = c("drug", "treatment")
                      )
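
Before fitting, it can be useful to confirm that every keyword actually occurs in the vocabulary of the documents. The following is a minimal base R sketch of such a check. The vocabulary `toy_vocab` and the shortened keyword list `toy_keywords` are hypothetical stand-ins, not the bills data.

```r
# Hypothetical vocabulary of a small corpus
toy_vocab <- c("education", "child", "court", "law", "health", "drug")

# Hypothetical keyword list in the same format as bills_keywords
toy_keywords <- list(
  Education = c("education", "child", "student"),
  Law       = c("court", "law")
)

# For each topic, collect the keywords that are absent from the vocabulary
missing_kw <- lapply(toy_keywords, function(kw) setdiff(kw, toy_vocab))

# Keep only the topics that have at least one missing keyword
missing_kw <- missing_kw[lengths(missing_kw) > 0]
missing_kw
```

With the real data, the vocabulary could presumably be taken from the quanteda object, for example with `quanteda::featnames(bills_dfm)`, assuming the keywords are spelled the same way as the features after preprocessing.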

Fitting the model

out <- keyATM(
              docs              = keyATM_docs,    # text input
              no_keyword_topics = 3,              # number of topics without keywords
              keywords          = bills_keywords, # keywords
              model             = "covariates",   # select the model
              model_settings    = list(covariates_data = bills_cov,
                                       covariates_formula = ~ RepParty),
              options           = list(seed = 50)
             )
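
The `covariates_formula` argument uses R's formula syntax. How keyATM processes the formula internally is not shown here. The sketch below only illustrates, with made-up data, what standard formula expansion through `model.matrix()` produces for `~ RepParty`. The object names `toy_party` and `X` are hypothetical.

```r
# Hypothetical covariate data for four documents
toy_party <- data.frame(RepParty = c(1, 0, 0, 1))

# Standard R expansion of a one-sided formula into a design matrix;
# this is an illustration, not a claim about keyATM's internals
X <- model.matrix(~ RepParty, data = toy_party)

colnames(X)  # an intercept column is added next to RepParty
dim(X)       # one row per document
```

Under this standard expansion, a single dummy variable yields two columns, so the number of columns in the design matrix can differ from the number of columns in the covariate data.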