instances_Matrix: Extract term-document matrix from instances
In agoldst/dfrtopics: Tools for exploring topic models of text

instances_Matrix

R Documentation

Extract term-document matrix from instances

Description

Given an instance list, returns a term-document matrix (sparse format).

Usage

instances_Matrix(instances, verbose = getOption("dfrtopics.verbose"))

Arguments

`instances`	file holding MALLET instances, or an rJava reference to a MALLET `InstanceList` object (e.g. from `read_instances`)
`verbose`	if `TRUE`, print some progress messages

Details

If the matrix is `m`, then `m[i, j]` gives the weight of word `i` in document `j`. If a different term weighting is desired, this matrix is a convenient starting point.
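
As a rough illustration of re-weighting, the sketch below applies a simple TF-IDF scaling to a small sparse matrix with words in rows and documents in columns. The toy matrix `m` and its values are made up and only stand in for the result of `instances_Matrix`; the particular idf formula is one common choice, not something this package prescribes.

```r
library(Matrix)

# Toy stand-in for the result of instances_Matrix(): words in rows,
# documents in columns. The values are invented for illustration.
m <- sparseMatrix(
    i = c(1, 2, 1, 3, 2, 3, 3),
    j = c(1, 1, 2, 2, 3, 3, 4),
    x = c(2, 1, 1, 4, 3, 1, 5),
    dims = c(3, 4)
)

# Document frequency: in how many documents does each word occur?
doc_freq <- rowSums(m > 0)

# Inverse document frequency (assumed form: log(N / df))
idf <- log(ncol(m) / doc_freq)

# Scale each row (word) by its idf. Multiplying by a sparse diagonal
# matrix keeps the result sparse.
m_tfidf <- Diagonal(x = idf) %*% m
```

Because the scaling is a product of sparse matrices, no dense copy of the corpus is created along the way.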

For the idea of going sparse, h/t Ben Marwick. The conversion is fairly slow because it copies all the corpus data from Java to R and then goes on to commit the Ultimate Sin of using a for loop. Pass `verbose = TRUE` for some reports on progress. TODO: make smarter.

Value

A `sparseMatrix` with documents in columns and words in rows. The ordering of the words is as in the vocabulary (`instances_vocabulary`), and the ordering of documents is as in the instance list (`instances_ids`).
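
Since rows follow the vocabulary order and columns follow the document-id order, the two vectors can be attached as dimnames so that summaries are labeled. The sketch below uses hypothetical stand-in vectors `vocab` and `ids` and a toy matrix in place of real output; with real data these would presumably come from `instances_vocabulary` and `instances_ids` applied to the same instance list.

```r
library(Matrix)

# Hypothetical stand-ins for the vocabulary and the document ids
vocab <- c("whale", "ship", "sea")
ids <- c("doc1", "doc2", "doc3", "doc4")

# Toy stand-in for the returned matrix (invented values)
m <- sparseMatrix(
    i = c(1, 2, 1, 3, 2, 3, 3),
    j = c(1, 1, 2, 2, 3, 3, 4),
    x = c(2, 1, 1, 4, 3, 1, 5),
    dims = c(length(vocab), length(ids))
)

# Rows are words, columns are documents, in the same order as the vectors
dimnames(m) <- list(vocab, ids)

# Total weight of each word across the corpus
word_totals <- rowSums(m)

# Total weight of each document (its length, if the weights are counts)
doc_lengths <- colSums(m)
```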
