textmodel_ca | R Documentation |
textmodel_ca implements correspondence analysis scaling on a dfm.
The method is a fast/sparse version of the function ca.
textmodel_ca(x, smooth = 0, nd = NA, sparse = FALSE, residual_floor = 0.1)
x |
the dfm on which the model will be fit |
smooth |
a smoothing parameter for word counts; defaults to zero. |
nd |
Number of dimensions to be included in output; if |
sparse |
retains the sparsity if set to |
residual_floor |
specifies the threshold for the residual matrix for
calculating the truncated SVD. A larger value will reduce memory and time cost
but might reduce accuracy; only applicable when |
svds() in the RSpectra package is used to enable fast computation of the SVD.
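To make the role of the residual matrix and the SVD concrete, here is a minimal base R sketch of textbook correspondence analysis on a tiny dense count matrix. It is an illustration only and not the package's implementation: the matrix `N`, the feature names, and all variable names are made up for this example, and it uses the dense `svd()` from base R rather than the truncated `svds()` from RSpectra.

```r
# Textbook correspondence analysis on a small dense count matrix (base R only).
# Hypothetical data: 4 documents by 3 features.
N <- matrix(c(10, 2, 3,
               4, 8, 1,
               2, 3, 9,
               5, 5, 5),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), c("tax", "health", "jobs")))

P      <- N / sum(N)            # correspondence matrix (relative frequencies)
r_mass <- rowSums(P)            # row (document) masses
c_mass <- colSums(P)            # column (feature) masses
E      <- outer(r_mass, c_mass) # expected proportions under independence
S      <- (P - E) / sqrt(E)     # standardized residual matrix

dec <- svd(S)                   # dense SVD of the residual matrix
nd  <- 2                        # number of dimensions to keep
sv  <- dec$d[seq_len(nd)]       # leading singular values

# Principal coordinates: scale singular vectors by masses and singular values.
row_coord <- sweep(dec$u[, seq_len(nd), drop = FALSE], 1, sqrt(r_mass), "/") %*% diag(sv, nd)
col_coord <- sweep(dec$v[, seq_len(nd), drop = FALSE], 1, sqrt(c_mass), "/") %*% diag(sv, nd)
rownames(row_coord) <- rownames(N)
rownames(col_coord) <- colnames(N)

# Total inertia is the sum of squared singular values.
total_inertia <- sum(dec$d^2)
```

In this sketch the residual matrix `S` is fully dense, which is the source of the memory cost for a large dfm. Total inertia equals the Pearson chi-squared statistic of `N` divided by `sum(N)`, which is a convenient sanity check.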
textmodel_ca()
returns a fitted CA textmodel that is a special
class of ca object.
You may need to set sparse = TRUE and increase the value of residual_floor
to ignore less important information, and hence reduce the memory cost,
when you have a very big dfm.
If your attempt to fit the model fails because the matrix is too large,
this is probably due to the memory demands of computing the V × V
residual matrix. To avoid this, consider increasing the value of
residual_floor in steps of 0.1 until the model can be fit.
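The advice to raise residual_floor step by step can be sketched as a small retry loop. The helper `fit_with_increasing_floor()` below is hypothetical and not part of quanteda. Its argument names and the upper limit `max_floor = 1` are assumptions chosen for illustration. It takes any fitting function of one argument, so it can be tried without a large dfm.

```r
# Hypothetical helper (not part of quanteda): call fit_fun() with an
# increasing residual_floor until it succeeds or max_floor is exceeded.
fit_with_increasing_floor <- function(fit_fun, start = 0.1, step = 0.1,
                                      max_floor = 1) {
  floor_value <- start
  # small tolerance guards against floating-point drift in repeated addition
  while (floor_value <= max_floor + 1e-8) {
    # treat any error (for example, a failed memory allocation) as a failed fit
    fit <- tryCatch(fit_fun(floor_value), error = function(e) NULL)
    if (!is.null(fit)) {
      return(list(fit = fit, residual_floor = floor_value))
    }
    floor_value <- floor_value + step
  }
  stop("model could not be fit with residual_floor <= ", max_floor)
}

# Intended use, assuming quanteda.textmodels is loaded and `dfmat` is a dfm:
# res <- fit_with_increasing_floor(
#   function(rf) textmodel_ca(dfmat, sparse = TRUE, residual_floor = rf)
# )
# res$residual_floor  # the first value for which the fit succeeded
```

Because the loop swallows every error, not only memory failures, it may be worth inspecting the error message of the first failed attempt before relying on it.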
Kenneth Benoit and Haiyan Wang
Nenadic, O. & Greenacre, M. (2007). Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca package. Journal of Statistical Software, 20(3). doi:10.18637/jss.v020.i03
coef.textmodel_lsa(), ca
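library("quanteda")
library("quanteda.textmodels")  # provides textmodel_ca() in quanteda >= 2.0
# build a document-feature matrix from the tokenized corpus
dfmat <- dfm(tokens(data_corpus_irishbudget2010))
# fit the correspondence analysis model and summarize it
tmod <- textmodel_ca(dfmat)
summary(tmod)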