augment.tidylda | R Documentation |
tidylda objects
augment appends observation-level model outputs.
## S3 method for class 'tidylda'
augment(
x,
data,
type = c("class", "prob"),
document_col = "document",
term_col = "term",
...
)
x |
an object of class tidylda |
data |
a tidy tibble containing one row per original document-token pair, such as is returned by tdm_tidiers with column names c("document", "term") at a minimum. |
type |
either "class" or "prob" |
document_col |
character specifying the name of the column that corresponds to document IDs. Defaults to "document". |
term_col |
character specifying the name of the column that corresponds to term/token IDs. Defaults to "term". |
... |
other arguments passed to methods, currently not used |
The key statistic for augment is P(topic | document, token) = P(topic | token) * P(token | document). P(topic | token) are the entries of the 'lambda' matrix in the tidylda object passed with x. P(token | document) is taken to be the frequency of each token normalized within each document.
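The calculation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the lambda matrix, the topic names t1 and t2, and the token counts are hypothetical toy values, with lambda laid out as topics in rows and tokens in columns.

```r
# Hypothetical P(topic | token): one column per token, each column sums to 1.
lambda <- matrix(
  c(0.9, 0.1,   # apple
    0.5, 0.5,   # bank
    0.2, 0.8),  # river
  nrow = 2,
  dimnames = list(c("t1", "t2"), c("apple", "bank", "river"))
)

# Hypothetical token counts for a single document.
counts <- c(apple = 2, bank = 1, river = 1)

# P(token | document): token frequency normalized within the document.
p_token_given_doc <- counts / sum(counts)

# Multiply each token's column of lambda by that token's frequency,
# giving P(topic | token) * P(token | document) for every topic-token pair.
stat <- sweep(lambda, 2, p_token_given_doc, `*`)
print(stat)
```

Each column of stat holds the statistic for one token of the document, with one entry per topic.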
augment returns a tidy tibble containing one row per document-token pair, with one or more columns appended, depending on the value of type.
If type = 'prob', then one column per topic is appended. Its value is P(topic | document, token).
If type = 'class', then the most-probable topic for each document-token pair is returned. If multiple topics are equally probable, then the topic with the smallest index is returned by default.
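The difference between the two output shapes, including the tie rule, can be illustrated with a small base R sketch. The values below are hypothetical and the code is not taken from the package; it only mimics the described behavior, using which.max, which returns the first (smallest-index) maximum.

```r
# Hypothetical statistics for one document: rows are tokens, columns are topics.
stat <- matrix(
  c(0.45, 0.125, 0.05,   # topic t1
    0.05, 0.125, 0.20),  # topic t2
  ncol = 2,
  dimnames = list(c("apple", "bank", "river"), c("t1", "t2"))
)

# Analogue of type = "prob": one appended column per topic.
prob_out <- data.frame(term = rownames(stat), stat, row.names = NULL)

# Analogue of type = "class": index of the most probable topic per token.
# which.max returns the first maximum, so ties go to the smallest index.
topic <- apply(stat, 1, which.max)
class_out <- data.frame(term = rownames(stat), topic = unname(topic))

print(prob_out)
print(class_out)
```

In this toy input, "bank" is equally probable under both topics, so the class-style output assigns it to topic 1.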