runCorpusCa: Correspondence analysis from a tm corpus
In RcmdrPlugin.temis: Graphical Integrated Text Mining Solution

runCorpusCa

R Documentation

Correspondence analysis from a tm corpus

Description

Compute a simple correspondence analysis on the document-term matrix of a tm corpus.

Usage

runCorpusCa(corpus, dtm = NULL, variables = NULL, sparsity = 0.9, ...)

Arguments

`corpus`	A tm corpus.
`dtm`	An optional document-term matrix to use; if `NULL` (the default), `DocumentTermMatrix` is called on `corpus` to create it.
`variables`	An optional character vector giving the names of the meta-data variables by which to aggregate the document-term matrix (see “Details” below).
`sparsity`	Optional sparsity threshold (between 0 and 1); terms whose sparsity reaches or exceeds it are dropped, so lower values remove more terms. See `removeSparseTerms` from tm.
`...`	Additional parameters passed to `ca`.

Details

The function runCorpusCa runs a correspondence analysis (CA) on a document-term matrix, which is either extracted from the tm corpus by calling the DocumentTermMatrix function or taken directly from the dtm argument if one is supplied.

If no variable is passed via the variables argument, a CA is run on the full document-term matrix (possibly skipping sparse terms, see below). If one or more variables are chosen, the CA is based on a stacked table whose rows correspond to the levels of the variables: each cell contains the total number of occurrences of a given term in all the documents belonging to that level. A document whose value is NA for a variable is skipped for that variable, but is still counted for the other variables, if any.
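The stacked table described above can be sketched in base R. This is only an illustration of the aggregation idea, not the package's own code: the toy counts, the `meta` data frame, the helper `aggregate_by` and the row-naming scheme are all assumptions made for the example.

```r
# Toy document-term matrix: 5 documents x 3 terms (made-up counts).
dtm <- matrix(c(2, 0, 1,
                0, 3, 0,
                1, 1, 0,
                0, 0, 4,
                2, 2, 2),
              nrow = 5, byrow = TRUE,
              dimnames = list(paste0("doc", 1:5), c("alpha", "beta", "gamma")))

# Hypothetical meta-data; doc4 has no value for 'author'.
meta <- data.frame(author = c("A", "B", "A", NA, "B"),
                   year   = c("2001", "2001", "2002", "2002", "2002"),
                   stringsAsFactors = FALSE)

# Sum term counts over the documents of each level of one variable,
# skipping documents whose value is NA for that variable only.
# (Hypothetical helper; row names "<variable> <level>" are an assumption.)
aggregate_by <- function(dtm, values, prefix) {
  keep <- !is.na(values)
  tab <- rowsum(dtm[keep, , drop = FALSE], group = values[keep])
  rownames(tab) <- paste(prefix, rownames(tab))
  tab
}

# Stack the per-variable tables on top of each other.
stacked <- do.call(rbind, lapply(names(meta), function(v) {
  aggregate_by(dtm, meta[[v]], v)
}))
print(stacked)
```

Here doc4 is left out of the two `author` rows but still contributes to the `year 2002` row, which mirrors the NA rule stated above.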

In all cases, variables that have not been selected are added as supplementary rows. If at least one variable is passed, documents are also supplementary rows, while they are active otherwise.
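To make the active/supplementary distinction concrete, the sketch below computes a CA by hand in base R and then projects one supplementary row onto the resulting axes. It uses the textbook SVD formulation of CA rather than the ca package, and the table, the `sup` row and all variable names are made up for illustration. The point is that supplementary rows get coordinates without influencing the axes.

```r
# Toy table: 4 active rows x 3 terms (made-up counts).
active <- matrix(c(3, 1, 1,
                   2, 5, 2,
                   2, 3, 1,
                   3, 3, 6),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("row", 1:4), c("alpha", "beta", "gamma")))

# One hypothetical supplementary row, e.g. a single document.
sup <- c(alpha = 1, beta = 0, gamma = 4)

# CA of the active table: SVD of the standardized residuals.
P     <- active / sum(active)
rmass <- rowSums(P)                       # row masses
cmass <- colSums(P)                       # column masses
S     <- diag(1 / sqrt(rmass)) %*% (P - rmass %o% cmass) %*% diag(1 / sqrt(cmass))
dec   <- svd(S)
nd    <- min(dim(active)) - 1             # number of non-trivial dimensions

# Standard coordinates of the columns (terms).
col_std <- (dec$v / sqrt(cmass))[, 1:nd, drop = FALSE]

# Principal coordinates of the active rows.
row_prin <- (dec$u / sqrt(rmass))[, 1:nd, drop = FALSE] %*% diag(dec$d[1:nd], nd)

# A supplementary row is projected through its profile (row / row total);
# it was not part of P, so it had no effect on the axes.
sup_prin <- (sup / sum(sup)) %*% col_std
print(sup_prin)
```

The same projection applied to the profiles of the active rows reproduces `row_prin`, which is a quick way to check that the supplementary coordinates live in the same space.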

The sparsity argument is passed to removeSparseTerms to drop terms that occur in too few documents from the document-term matrix. This is especially useful for big corpora, whose matrices can grow very large and cause ca to use too much memory.
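The effect of the threshold can be illustrated on a plain matrix. The sketch below is a base-R imitation of the rule documented for removeSparseTerms in tm (keep only terms whose sparsity is less than the threshold), not a call to tm itself; the counts and the helper `drop_sparse` are hypothetical.

```r
# Toy document-term matrix: 5 documents x 4 terms (made-up counts).
dtm <- matrix(c(1, 0, 2, 0,
                1, 0, 0, 0,
                3, 0, 1, 0,
                1, 1, 0, 0,
                2, 0, 0, 5),
              nrow = 5, byrow = TRUE,
              dimnames = list(paste0("doc", 1:5), c("the", "rare", "mid", "once")))

# Sparsity of a term = share of documents in which it does not occur.
term_sparsity <- colMeans(dtm == 0)
print(term_sparsity)

# Keep only the terms whose sparsity is strictly below the threshold
# (hypothetical helper mimicking the documented rule).
drop_sparse <- function(dtm, sparsity) {
  dtm[, colMeans(dtm == 0) < sparsity, drop = FALSE]
}

reduced <- drop_sparse(dtm, sparsity = 0.7)
print(colnames(reduced))
```

With a threshold of 0.7, the terms absent from four of the five documents are dropped. With the default of 0.9, a term has to appear in more than 10% of the documents to be kept.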

Value

A ca object as returned by the ca function.
