augment.tidylda: Augment method for 'tidylda' objects

View source: R/tidy-methods.R

augment.tidylda    R Documentation

Augment method for tidylda objects

Description

augment appends observation-level model outputs to the supplied data.

Usage

## S3 method for class 'tidylda'
augment(
  x,
  data,
  type = c("class", "prob"),
  document_col = "document",
  term_col = "term",
  ...
)

Arguments

x

an object of class tidylda

data

a tidy tibble containing one row per original document-token pair, such as is returned by tdm_tidiers. At a minimum it must have the columns "document" and "term".

type

either "class" or "prob". In the usage above, "class" is listed first.

document_col

character specifying the name of the column that corresponds to document IDs. Defaults to "document".

term_col

character specifying the name of the column that corresponds to term/token IDs. Defaults to "term".

...

other arguments passed to methods; currently not used

Details

The key statistic for augment is P(topic | document, token) = P(topic | token) * P(token | document). The values of P(topic | token) are the entries of the 'lambda' matrix in the tidylda object passed as x. P(token | document) is taken to be the frequency of each token, normalized within each document.
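The arithmetic described above can be sketched in base R. This is only an illustration of the stated formula, not the package's internal code: the toy lambda matrix, its topics-by-tokens orientation, the tokens data frame, and the count column n are all assumptions made for the example.

```r
# Toy inputs; every value here is made up for illustration.
# lambda: topics x tokens (assumed orientation), entries read as
# P(topic | token), so each column sums to 1.
lambda <- matrix(
  c(0.7, 0.3,   # apple
    0.2, 0.8,   # bank
    0.5, 0.5),  # river
  nrow = 2,
  dimnames = list(c("topic_1", "topic_2"), c("apple", "bank", "river"))
)

# One row per document-token pair, with a hypothetical count column n.
tokens <- data.frame(
  document = c("d1", "d1", "d2", "d2"),
  term     = c("apple", "bank", "bank", "river"),
  n        = c(3, 1, 2, 2),
  stringsAsFactors = FALSE
)

# P(token | document): token frequency normalized within each document.
tokens$p_token_doc <- tokens$n / ave(tokens$n, tokens$document, FUN = sum)

# P(topic | token) for each row: pick the token's column of lambda,
# then transpose so rows line up with the rows of tokens.
p_topic_token <- t(lambda[, tokens$term, drop = FALSE])
rownames(p_topic_token) <- NULL

# The statistic from the text: P(topic | token) * P(token | document).
# The vector recycles down columns, so row i is scaled by p_token_doc[i].
stat <- p_topic_token * tokens$p_token_doc

result <- data.frame(tokens, stat)
print(result)
```

Each row of result keeps its document and term and gains one column per topic, which mirrors the shape described for type = "prob" under Value.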

Value

augment returns a tidy tibble containing one row per document-token pair, with one or more columns appended, depending on the value of type.

If type = "prob", one column per topic is appended. Each holds P(topic | document, token) for that topic.

If type = "class", the most probable topic for each document-token pair is returned. If multiple topics are equally probable, the topic with the smallest index is returned.
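The tie-breaking rule can be illustrated with a small base R sketch. The probs matrix below is hypothetical, and which.max is used here only because it happens to show the same "smallest index wins" behavior; the source does not say which function the package uses.

```r
# Hypothetical per-topic values for three document-token pairs.
probs <- matrix(
  c(0.525, 0.225,
    0.050, 0.200,
    0.250, 0.250),  # exact tie between topic 1 and topic 2
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("topic_1", "topic_2"))
)

# which.max() returns the first position of the maximum, so a tie
# resolves to the smallest topic index.
topic_class <- apply(probs, 1, which.max)
print(topic_class)
```

Note that base R's max.col() breaks ties at random unless ties.method = "first" is supplied, so it would not reproduce this rule with its default setting.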


tidylda documentation built on July 26, 2023, 5:34 p.m.