# textmodel_ca: Correspondence analysis of a document-feature matrix In quanteda/quanteda: Quantitative Analysis of Textual Data

## Description

textmodel_ca implements correspondence analysis scaling on a dfm. The method is a fast, sparse version of the function ca from the ca package.

## Usage

    textmodel_ca(x, smooth = 0, nd = NA, sparse = FALSE, residual_floor = 0.1)

## Arguments

| Argument | Description |
|---|---|
| x | The dfm on which the model will be fit. |
| smooth | A smoothing parameter for word counts; defaults to zero. |
| nd | Number of dimensions to include in the output; if NA (the default), the maximum possible number of dimensions is included. |
| sparse | Retains the sparsity if set to TRUE; set it to TRUE if x (the dfm) is too big to be allocated after converting to dense. |
| residual_floor | Specifies the threshold for the residual matrix used when calculating the truncated SVD. A larger value will reduce memory and time cost but might reduce accuracy; only applicable when sparse = TRUE. |

## Details

svds in the RSpectra package is applied to enable the fast computation of the SVD.
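To make the role of the SVD concrete, here is a minimal base-R sketch of textbook correspondence analysis on a tiny, made-up count matrix. It is not taken from the quanteda source: it uses the dense base function svd rather than RSpectra's svds, it ignores smooth and residual_floor, and all object names are illustrative.

```r
# Toy document-feature counts (3 documents, 4 features); the values are made up.
counts <- matrix(c(4, 1, 0, 2,
                   1, 3, 2, 0,
                   0, 2, 5, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("doc", 1:3), paste0("feat", 1:4)))

P <- counts / sum(counts)        # correspondence matrix (relative frequencies)
row_mass <- rowSums(P)           # row (document) masses
col_mass <- colSums(P)           # column (feature) masses
E <- outer(row_mass, col_mass)   # expected proportions under independence
S <- (P - E) / sqrt(E)           # standardized residuals

dec <- svd(S)                    # dense SVD; a truncated solver would replace this step
nd <- min(dim(S)) - 1            # maximum number of non-trivial dimensions
sv <- dec$d[seq_len(nd)]         # singular values kept

# Standard coordinates: singular vectors rescaled by the inverse square root of the masses.
row_coord <- dec$u[, seq_len(nd), drop = FALSE] / sqrt(row_mass)
col_coord <- dec$v[, seq_len(nd), drop = FALSE] / sqrt(col_mass)

total_inertia <- sum(sv^2)       # equals the chi-squared statistic divided by sum(counts)
```

The last singular value of S is zero up to rounding, which is why at most min(rows, columns) - 1 dimensions carry information. On a large sparse dfm, computing only the leading singular triplets is what makes a truncated solver attractive.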

## Value

textmodel_ca() returns a fitted CA textmodel that is a special class of ca object.

## Note

When the dfm is very large, you may need to set sparse = TRUE and increase the value of residual_floor, which discards less important information and so reduces the memory cost. If fitting fails because the matrix is too large, the likely cause is the memory needed to compute the V × V residual matrix. In that case, increase residual_floor in steps of 0.1 until the model can be fit.
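The advice above can be sketched as a small retry loop. The helper fit_with_floor is hypothetical and not part of quanteda; it accepts any fitting function that takes sparse and residual_floor arguments, such as textmodel_ca as documented here, and raises residual_floor in steps of 0.1 until the fit succeeds. The max_floor cap is an assumption added so that the loop always terminates.

```r
# Hypothetical helper, not part of quanteda: retry a fit with a growing residual_floor.
fit_with_floor <- function(fit_fun, x, start = 0.1, step = 0.1, max_floor = 1) {
  floor_value <- start
  # Small tolerance guards against floating-point drift when adding 0.1 repeatedly.
  while (floor_value <= max_floor + 1e-8) {
    fit <- tryCatch(
      fit_fun(x, sparse = TRUE, residual_floor = floor_value),
      error = function(e) NULL  # any error (e.g. an allocation failure) triggers a retry
    )
    if (!is.null(fit)) {
      return(list(fit = fit, residual_floor = floor_value))
    }
    floor_value <- floor_value + step
  }
  stop("model could not be fit with residual_floor <= ", max_floor)
}
```

With quanteda loaded and a dfm named dfmat, a call would look like res <- fit_with_floor(textmodel_ca, dfmat), after which res$fit holds the model and res$residual_floor the value that worked. Because the sketch retries on every error, it is worth checking the error message by hand if even large floors fail.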

## Author(s)

Kenneth Benoit and Haiyan Wang

## References

Nenadic, O. & Greenacre, M. (2007). Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca package. Journal of Statistical Software, 20(3).

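## See Also

coef.textmodel_lsa, ca

## Examples

    library(quanteda)
    # build a dfm from the bundled Irish budget corpus and fit the CA model
    dfmat <- dfm(data_corpus_irishbudget2010)
    tmod <- textmodel_ca(dfmat)
    summary(tmod)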