new_tidylda | R Documentation |
tidylda
Since all three of tidylda, refit.tidylda, and predict.tidylda call fit_lda_c, we need a way to format the resulting posteriors and other user-facing objects consistently. This function does that.
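The idea of routing several entry points through one constructor can be sketched in base R. This is only an illustration of the general S3 pattern, not the package's implementation: new_toy_model, fit_toy, refit_toy, and the "toy_model" class are hypothetical names invented for this example.

```r
# Hypothetical constructor: `new_toy_model` and its fields are illustrative
# names and are not part of tidylda.
new_toy_model <- function(theta, beta, call = NULL) {
  stopifnot(is.matrix(theta), is.matrix(beta))
  structure(
    list(theta = theta, beta = beta, call = call),
    class = "toy_model"
  )
}

# Two different (hypothetical) entry points share the same constructor, so
# their outputs always carry the same slots in the same order.
fit_toy <- function(theta, beta) {
  new_toy_model(theta, beta, call = match.call())
}

refit_toy <- function(object, theta) {
  new_toy_model(theta, object$beta, call = match.call())
}

theta <- matrix(c(0.7, 0.3, 0.2, 0.8), nrow = 2, byrow = TRUE)
beta  <- matrix(c(0.5, 0.5, 0.1, 0.9), nrow = 2, byrow = TRUE)

m1 <- fit_toy(theta, beta)
m2 <- refit_toy(m1, theta)

# Both objects expose the same slot names and the same class.
identical(names(m1), names(m2))
```

Keeping the formatting logic in one place means a change to the object's layout only has to be made once.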
new_tidylda(
lda,
dtm,
burnin,
is_prediction = FALSE,
alpha = NULL,
eta = NULL,
optimize_alpha = NULL,
calc_r2 = NULL,
calc_likelihood = NULL,
call = NULL,
threads
)
lda |
list output of |
dtm |
a document term matrix or term co-occurrence matrix of class |
burnin |
integer number of burnin iterations. |
is_prediction |
is this for a prediction (as opposed to initial fitting,
or update)? Defaults to |
alpha |
output of |
eta |
output of |
optimize_alpha |
did you optimize |
calc_r2 |
did the user want to calculate R-squared when calculating
the model? If |
calc_likelihood |
did you calculate the log likelihood when making a call
to |
call |
the result of calling |
threads |
number of parallel threads |
Returns an S3 object of class tidylda with the following slots:
beta is a numeric matrix whose rows are the posterior estimates of P(token|topic).
theta is a numeric matrix whose rows are the posterior estimates of P(topic|document).
lambda is a numeric matrix whose rows are the posterior estimates of P(topic|token), calculated using Bayes's rule. See calc_lambda.
alpha is the prior for topics over documents. If optimize_alpha is FALSE, alpha is what the user passed when calling tidylda. If optimize_alpha is TRUE, alpha is a numeric vector returned in the alpha slot from a call to fit_lda_c.
eta is the prior for tokens over topics. This is what the user passed when calling tidylda.
summary is the result of a call to summarize_topics.
call is the result of match.call called at the top of tidylda.
log_likelihood is a tibble whose columns are the iteration and log likelihood at that iteration. This slot is only populated if calc_likelihood = TRUE.
r2 is a numeric scalar resulting from a call to calc_rsquared. This slot is only populated if calc_r2 = TRUE.
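The Bayes's rule step behind the lambda slot can be illustrated with a small base R sketch. It does not call calc_lambda and is not claimed to match it: the matrix shapes, the uniform document weights in p_docs, and the name lambda_sketch are all assumptions made for this example.

```r
# Toy inputs; the shapes are assumptions for this sketch.
# beta:  K topics x V tokens, each row sums to 1 -> P(token | topic)
# theta: D docs   x K topics, each row sums to 1 -> P(topic | document)
beta <- matrix(c(0.6, 0.3, 0.1,
                 0.1, 0.2, 0.7), nrow = 2, byrow = TRUE)
theta <- matrix(c(0.9, 0.1,
                  0.5, 0.5,
                  0.2, 0.8), nrow = 3, byrow = TRUE)

# Assumption: every document is equally likely, P(document) = 1 / D.
p_docs <- rep(1 / nrow(theta), nrow(theta))

# P(topic) = sum over documents of P(topic | document) * P(document)
p_topics <- as.vector(p_docs %*% theta)

# Joint P(topic, token) = P(token | topic) * P(topic).
# Recycling scales row i of beta by p_topics[i] because length(p_topics)
# equals nrow(beta).
joint <- beta * p_topics

# Bayes's rule: divide each column by P(token), the column sum of the joint.
lambda_sketch <- sweep(joint, 2, colSums(joint), "/")

# Each column is a distribution over topics for one token, so the column
# sums should equal 1 up to floating point error.
colSums(lambda_sketch)
```

In this sketch the result is laid out as topics by tokens, so each column gives P(topic|token) for one token; the orientation used by the package may differ.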
In general, the arguments of this function should be what the user passed when calling tidylda.
burnin is used only to determine whether or not burn-in iterations were used when fitting the model. If burnin > -1 then posteriors are calculated using lda$Cd_mean and lda$Cv_mean respectively. Otherwise, posteriors are calculated using lda$Cd and lda$Cv.
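The burnin switch can be sketched in base R as follows. This is a hedged illustration, not the package's code: the toy lda list, the function posterior_sketch, the matrix orientations, the use of lda$Cd and lda$Cv for the non-averaged counts, and the "add the prior, then normalise each row" step are all assumptions made for the example.

```r
# Toy stand-in for the sampler output; all values are made up.
# Assumed layout: Cd* is documents x topics, Cv* is topics x tokens.
lda <- list(
  Cd      = matrix(c(4, 1, 2, 3), nrow = 2, byrow = TRUE),
  Cv      = matrix(c(3, 2, 1, 1, 1, 2), nrow = 2, byrow = TRUE),
  Cd_mean = matrix(c(3.5, 1.5, 2.2, 2.8), nrow = 2, byrow = TRUE),
  Cv_mean = matrix(c(2.6, 2.1, 1.3, 1.2, 1.1, 1.7), nrow = 2, byrow = TRUE)
)
alpha <- 0.1   # symmetric prior over topics (a scalar, for simplicity)
eta   <- 0.05  # symmetric prior over tokens (a scalar, for simplicity)

# Hypothetical helper: chooses which count matrices to use based on burnin.
posterior_sketch <- function(lda, burnin, alpha, eta) {
  if (burnin > -1) {
    # Burn-in was used: take the counts averaged over post-burn-in iterations.
    cd <- lda$Cd_mean
    cv <- lda$Cv_mean
  } else {
    # No burn-in: take the non-averaged counts (assumed names Cd and Cv).
    cd <- lda$Cd
    cv <- lda$Cv
  }
  # Add the prior to the counts, then normalise each row to sum to 1.
  theta <- (cd + alpha) / rowSums(cd + alpha)
  beta  <- (cv + eta) / rowSums(cv + eta)
  list(theta = theta, beta = beta)
}

with_burnin    <- posterior_sketch(lda, burnin = 5,  alpha = alpha, eta = eta)
without_burnin <- posterior_sketch(lda, burnin = -1, alpha = alpha, eta = eta)
```

Note that only the sign of burnin relative to -1 matters here; its magnitude is not used when formatting the posteriors.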
The class of call isn't checked. It is passed through unchanged to the object returned by this function, which may be useful if you are calling this function directly for troubleshooting.