LexCA: Correspondence Analysis of a Lexical Table from a TextData...
In Xplortext: Statistical Analysis of Textual Data

LexCA

R Documentation

Correspondence Analysis of a Lexical Table from a TextData object (LexCA)

Description

Performs Correspondence Analysis (CA) on the working lexical table contained in a TextData object. Supplementary documents, words, repeated segments, and contextual quantitative and qualitative variables can be included if they were previously selected in the TextData function.

Usage

LexCA(object, ncp=5, context.sup="ALL", doc.sup=NULL, word.sup=NULL, 
  segment=FALSE, graph=TRUE, axes=c(1, 2), lmd=3, lmw=3)

Arguments

`object`	object of TextData class
`ncp`	number of dimensions kept in the results (by default 5)
`context.sup`	column index(es) or name(s) of the contextual qualitative or quantitative variables among those selected in TextData function (by default "ALL")
`doc.sup`	vector indicating the index(es) or name(s) of the supplementary documents (rows) (by default NULL)
`word.sup`	vector indicating the index(es) or name(s) of the supplementary words (columns) (by default NULL)
`segment`	if TRUE, the repeated segments identified by TextData function will be considered as supplementary columns (by default FALSE)
`graph`	if TRUE, basic graphs are displayed; use plot.LexCA to obtain more graphs (by default TRUE)
`axes`	length-2 vector indicating the axes to plot (by default axes=c(1,2))
`lmd`	only the documents whose contribution is over lmd times the average-document-contribution are plotted (by default lmd=3)
`lmw`	only the words whose contribution is over lmw times the average-word-contribution are plotted (by default lmw=3)
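
The filtering rule behind `lmd` and `lmw` can be illustrated in base R. This is a minimal sketch, not the package's internal code: the contribution vector `ctr` and the helper `over_threshold` are hypothetical, and how LexCA combines contributions across the plotted axes is not shown here.

```r
# Hypothetical contributions (in %) of six documents to the plotted plane.
ctr <- c(doc1 = 40, doc2 = 5, doc3 = 30, doc4 = 10, doc5 = 12, doc6 = 3)

# Keep the items whose contribution exceeds `lm` times the average contribution.
over_threshold <- function(ctr, lm) {
  names(ctr)[ctr > lm * mean(ctr)]
}

sel_2 <- over_threshold(ctr, lm = 2)  # strict threshold: few documents kept
sel_1 <- over_threshold(ctr, lm = 1)  # documents above the average contribution
sel_0 <- over_threshold(ctr, lm = 0)  # keeps every document with a positive contribution
```

Under this reading, a value of 0 (as used with `lmd=0` in the Examples) plots essentially all documents, while larger values keep only the most contributive ones.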

Details

In the case of a direct CA, DocTerm is a non-aggregate table and:

the contextual quantitative variables are considered as supplementary quantitative columns in CA.
the categories of the contextual qualitative variables are considered as supplementary columns in CA.

In the case of an aggregate CA, DocTerm is an aggregate table and:

the contextual quantitative variables are considered as supplementary quantitative columns in CA; the value of an active aggregate-document for a variable is the mean of the values corresponding to the source-documents belonging to this aggregate-document.
the categories of the contextual qualitative variables are treated as supplementary rows in CA; each of these rows contains the frequencies with which the set of documents belonging to that category has used the different words.
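
The aggregate case described above can be sketched in base R. The table `DocTerm_toy` and the variables `age_group` and `income` are hypothetical toy data, and the code only illustrates the aggregation rules (summed word frequencies per category, mean of a quantitative variable per aggregate document); it is not the code used inside TextData or LexCA.

```r
# Toy non-aggregate lexical table: 4 source documents x 3 words.
DocTerm_toy <- matrix(c(2, 0, 1,
                        1, 1, 0,
                        0, 3, 1,
                        1, 0, 2),
                      nrow = 4, byrow = TRUE,
                      dimnames = list(paste0("d", 1:4),
                                      c("good", "health", "work")))

# Hypothetical contextual variables attached to the source documents.
age_group <- factor(c("young", "young", "old", "old"))  # qualitative
income    <- c(10, 20, 30, 50)                          # quantitative

# One row per category: word frequencies summed over the documents of that category.
agg_table <- rowsum(DocTerm_toy, group = age_group)

# Quantitative variable of each aggregate document: mean over its source documents.
agg_income <- tapply(income, age_group, mean)
```

Here `agg_table` has one row per level of `age_group`, which is the form taken by both an aggregate lexical table and the supplementary rows built from a qualitative variable.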

Value

Returns a list including:

`eig`	matrix with the eigenvalues, the percentages of inertia and the cumulative percentages of inertia
`row`	list of matrices with all the results for the documents (coordinates, square cosines, contributions, inertia)
`col`	list of matrices with all the results for the words (coordinates, square cosines, contributions, inertia)
`row.sup`	if doc.sup is non-NULL, list of matrices with all the results for the supplementary documents (coordinates, square cosines)
`col.sup`	if word.sup is non-NULL, list of matrices with all the results for the supplementary words (coordinates, square cosines)
`quanti.sup`	if supplementary quantitative variables are present, list of matrices containing the results for these variables (coordinates, square cosines)
`quali.sup`	if supplementary categorical variables are present, list of matrices with all the results for these variables; see the Details section
`meta`	list of the documents/words whose contribution is over lmd/lmw times the average document/word contribution
`VCr`	Cramer's V coefficient
`Inertia`	total inertia
`info`	information about the corpus
`segment`	if segment is TRUE, list of matrices with the results for the repeated segments (coordinates, square cosines)
`var.agg`	name of the aggregation variable in the case of an aggregate correspondence analysis
`call`	a list with some statistics
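
The relationship between the eigenvalues, the total inertia and Cramér's V can be checked on a small table in base R. This is a sketch under the standard CA definitions (total inertia equal to the chi-square statistic divided by the grand total, and V equal to the square root of the inertia divided by the smaller of rows minus one and columns minus one); the matrix `N` is hypothetical and the code does not reproduce LexCA's internals.

```r
# Toy lexical table: 3 documents x 4 words (hypothetical counts).
N <- matrix(c(10, 2, 3, 5,
               4, 8, 1, 7,
               6, 3, 9, 2),
            nrow = 3, byrow = TRUE)

P  <- N / sum(N)   # correspondence matrix
rw <- rowSums(P)   # row (document) masses
cw <- colSums(P)   # column (word) masses

# Standardized residuals; their squared singular values are the CA eigenvalues.
S   <- (P - rw %o% cw) / sqrt(rw %o% cw)
eig <- svd(S)$d^2

inertia  <- sum(eig)                                        # total inertia
pct      <- 100 * eig / inertia                             # percentages of inertia
cum_pct  <- cumsum(pct)                                     # cumulative percentages
cramer_v <- sqrt(inertia / min(nrow(N) - 1, ncol(N) - 1))   # Cramér's V
```

The three vectors `eig`, `pct` and `cum_pct` correspond to the three kinds of quantities described for the `eig` element, while `inertia` and `cramer_v` play the roles of `Inertia` and `VCr`.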

Author(s)

Ramón Alvarez-Esteban ramon.alvarez@unileon.es, Mónica Bécue-Bertaut, Josep-Anton Sánchez-Espigares

References

Benzécri, J.-P. (1981). Pratique de l'analyse des données: Linguistique & lexicologie (Vol. 3). Dunod.

Husson F., Lê S., Pagès J. (2011). Exploratory Multivariate Analysis by Example Using R. Chapman & Hall/CRC. doi:10.1201/b10345.

Lebart, L., Salem, A., & Berry, L. (1998). Exploring textual data. Kluwer. doi:10.1007/978-94-017-1525-6.

Murtagh F. (2005). Correspondence Analysis and Data Coding with R and Java. Chapman & Hall/CRC.

Examples

library(Xplortext)
data(open.question)
## Not run: 
### non-aggregate CA
res.TD<-TextData(open.question, var.text=c(9,10), Fmin=10, Dmin=10,
        remov.number=TRUE, stop.word.tm=TRUE)
res.LexCA<-LexCA(res.TD, lmd=0, lmw=1)

## End(Not run)

### aggregate CA
res.TD<-TextData(open.question, var.text=c(9,10), var.agg="Age_Group", Fmin=10, Dmin=10,
        remov.number=TRUE, stop.word.tm=TRUE)
res.LexCA<-LexCA(res.TD, lmd=0, lmw=1)

Xplortext documentation built on April 12, 2025, 2:13 a.m.