lca: Lexical Correspondence Analysis


View source: R/lca.R

Description

Correspondence analysis of textual data.

Usage

lca(
  df,
  docid_field = NULL,
  text_field = NULL,
  min_docfreq = 0.5,
  max_docfreq = 99
)

Arguments

df

A data frame with at least one column of textual data and one column of document IDs.

docid_field

Name of the column (in quotation marks) containing the document IDs (default NULL).

text_field

Name of the column (in quotation marks) containing the textual data (default NULL).

min_docfreq

Minimum document frequency of a feature, expressed as a percentile; features below it are removed (default: 0.5th percentile).

max_docfreq

Maximum document frequency of a feature, expressed as a percentile; features above it are removed (default: 99th percentile).

Details

The function is essentially a wrapper around functions available in quanteda and factoextra. More specifically, it leverages the correspondence analysis function CA. The fitted model (ca_model) can be explored further and plotted with fviz_ca_biplot.
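The pipeline described above can be sketched roughly as follows. This is an illustrative sketch, not the package source. The toy data frame is hypothetical, CA is assumed to come from FactoMineR, and the percentile defaults (0.5 and 99) are assumed to map to quantiles (0.005 and 0.99) in quanteda's dfm_trim.

```r
# Illustrative sketch only; the real lca() implementation may differ.
library(quanteda)
library(FactoMineR)  # assumption: CA() is taken from FactoMineR
library(factoextra)

# Hypothetical toy data; column names mirror the Examples section.
df <- data.frame(
  documents = c("d1", "d2", "d3", "d4"),
  texts = c("cats chase mice", "dogs chase cats",
            "mice eat cheese", "dogs eat meat"),
  stringsAsFactors = FALSE
)

# Build a corpus, tokenize it, and create a document-feature matrix.
corp <- corpus(df, docid_field = "documents", text_field = "texts")
toks <- tokens(corp, remove_punct = TRUE)
dfm_mat <- dfm(toks)

# Drop very rare and very common features by document frequency.
# Assumption: percentiles 0.5 and 99 correspond to quantiles 0.005 and 0.99.
dfm_mat <- dfm_trim(dfm_mat,
                    min_docfreq = 0.005,
                    max_docfreq = 0.99,
                    docfreq_type = "quantile")

# Fit the correspondence analysis on the documents-by-features table.
ca_input <- as.data.frame(as.matrix(dfm_mat))
ca_model <- CA(ca_input, graph = FALSE)

# Biplot of documents (rows) and features (columns).
p <- fviz_ca_biplot(ca_model, repel = TRUE)
```

On real data, documents left empty after trimming would probably need to be dropped before calling CA, since correspondence analysis cannot handle all-zero rows.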

Value

A fitted correspondence analysis model.
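Assuming the returned object has the structure produced by FactoMineR's CA (an assumption; this page only states that a correspondence analysis model is returned), it could be inspected along these lines. The housetasks contingency table shipped with factoextra stands in here for a model fitted on text.

```r
# Sketch: exploring a CA object; `res` stands in for the value returned by lca().
library(FactoMineR)
library(factoextra)

# Example contingency table bundled with factoextra (13 tasks x 4 categories).
data(housetasks, package = "factoextra")
res <- CA(housetasks, graph = FALSE)

# Eigenvalues and share of variance explained by each dimension.
eig <- get_eigenvalue(res)
print(eig)

# Coordinates of rows and columns on the retained dimensions.
head(res$row$coord)
head(res$col$coord)

# Biplot showing rows and columns in the same space.
p <- fviz_ca_biplot(res, repel = TRUE)
```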

Examples

## Not run: 
# lca() returns a correspondence analysis model, so store it in its own object
ca_model <- lca(df, docid_field = "documents", text_field = "texts")
## End(Not run)

nicolarighetti/textools documentation built on Oct. 16, 2021, 11:20 p.m.