textmodel_ca implements correspondence analysis scaling on a
dfm. The method is a fast/sparse version of the function ca from the ca package.
textmodel_ca(x, smooth = 0, nd = NA, sparse = FALSE, residual_floor = 0.1)
x: the dfm on which the model will be fit
smooth: a smoothing parameter for word counts; defaults to zero.
nd: number of dimensions to be included in the output; if NA (the default), the maximum possible number of dimensions is included.
sparse: retains the sparsity if set to TRUE; defaults to FALSE.
residual_floor: specifies the threshold for the residual matrix used when
calculating the truncated SVD. A larger value will reduce memory and time cost
but might reduce accuracy; only applicable when sparse = TRUE.
svds in the RSpectra package is applied to enable the fast computation of the SVD.
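To make the "residual matrix" and its SVD concrete, here is a minimal base-R sketch of correspondence analysis on a toy count table. It is an illustration under stated assumptions, not the package's implementation: the counts are made up, the variable names are hypothetical, and it uses the dense svd() from base R rather than the truncated svds() from RSpectra.

```r
# Toy document-feature counts (made-up numbers: 3 documents x 4 features).
counts <- matrix(c(10, 2, 3, 5,
                    4, 8, 1, 7,
                    6, 3, 9, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("doc", 1:3), paste0("feat", 1:4)))

P <- counts / sum(counts)               # correspondence matrix (proportions)
row_mass <- rowSums(P)                  # document masses
col_mass <- colSums(P)                  # feature masses
expected <- outer(row_mass, col_mass)   # proportions expected under independence

# Standardised residuals: the matrix that correspondence analysis decomposes.
S <- (P - expected) / sqrt(expected)

# Dense SVD; at most min(nrow, ncol) - 1 dimensions carry information.
dec <- svd(S)
nd <- min(dim(counts)) - 1
sv <- dec$d[seq_len(nd)]

# Standard coordinates of documents and features on the retained dimensions.
row_coord <- dec$u[, seq_len(nd), drop = FALSE] / sqrt(row_mass)
col_coord <- dec$v[, seq_len(nd), drop = FALSE] / sqrt(col_mass)

# Total inertia: the chi-squared statistic divided by the grand total.
total_inertia <- sum(sv^2)
```

For a real dfm the residual matrix is large, which is why a truncated and, optionally, sparse decomposition is attractive.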
textmodel_ca() returns a fitted CA textmodel that is a special
class of ca object.
You may need to set sparse = TRUE and
increase the value of residual_floor to ignore less important
information, and hence reduce the memory cost, when you have a very big
dfm.
If your attempt to fit the model fails due to the matrix being too large,
this is probably because of the memory demands of computing the V × V
residual matrix. To avoid this, consider increasing the value of
residual_floor by 0.1, until the model can be fit.
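The retry advice above can be sketched as a small loop. The helper fit_with_rising_floor() is hypothetical (it is not part of quanteda.textmodels), and it assumes that a failed fit surfaces as an ordinary R error that tryCatch() can intercept; a process killed by the operating system for exhausting memory would not be caught this way.

```r
# Hypothetical helper (not part of quanteda.textmodels): call `fit` with an
# increasing residual_floor until it succeeds or `max_floor` is reached.
fit_with_rising_floor <- function(fit, start = 0.1, step = 0.1, max_floor = 1) {
  for (residual_floor in seq(start, max_floor, by = step)) {
    # Treat any error (e.g. a failed memory allocation) as a failed attempt.
    result <- tryCatch(fit(residual_floor), error = function(e) NULL)
    if (!is.null(result)) {
      return(list(model = result, residual_floor = residual_floor))
    }
  }
  stop("model could not be fit for residual_floor up to ", max_floor)
}

# Usage with textmodel_ca(); runs only if both packages are installed.
if (requireNamespace("quanteda", quietly = TRUE) &&
    requireNamespace("quanteda.textmodels", quietly = TRUE)) {
  library("quanteda")
  library("quanteda.textmodels")
  dfmat <- dfm(tokens(data_corpus_irishbudget2010))
  res <- fit_with_rising_floor(function(rf) {
    textmodel_ca(dfmat, sparse = TRUE, residual_floor = rf)
  })
  res$residual_floor  # the first floor value at which the fit succeeded
}
```

Because a larger residual_floor might reduce accuracy, the loop starts at the default and stops at the first value that works rather than jumping straight to a large threshold.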
Kenneth Benoit and Haiyan Wang
Nenadic, O. & Greenacre, M. (2007). Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca package. Journal of Statistical Software, 20(3). doi: 10.18637/jss.v020.i03
library("quanteda")
library("quanteda.textmodels")  # provides textmodel_ca() and the example corpus
dfmat <- dfm(tokens(data_corpus_irishbudget2010))
tmod <- textmodel_ca(dfmat)
summary(tmod)