instance_vector: Convert a MALLET Instance to an integer vector

instance_vector    R Documentation

Description

Given a single MALLET Instance (not an InstanceList), this function returns the Instance's FeatureSequence as an R integer vector.

Usage

instance_vector(instance)

Arguments

instance

a reference to a single Instance

Details

A FeatureSequence is a list of zero-based indices into the vocabulary. For convenience, this function adds 1 so that the result can be used to index directly into a vocabulary vector (e.g. from trainer$getVocabulary() or instances_vocabulary). MALLET's topic modeling works on feature sequences because it is designed to preserve the order of words in the documents it models. If you have used pre-aggregated data from JSTOR, however, the "sequences" carry no meaningful word order.
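
The index shift described above can be illustrated without a running MALLET session. The following is a minimal sketch, not the package's implementation: vocab and zero_based are hypothetical stand-ins for a vocabulary vector and for the zero-based indices a FeatureSequence might hold.

```r
# Hypothetical stand-ins (not produced by MALLET): a small vocabulary and
# the zero-based feature indices of one document.
vocab <- c("whale", "sea", "ship", "captain")
zero_based <- c(2L, 0L, 0L, 3L)

# The shift that instance_vector() is described as applying: add 1 so the
# indices follow R's one-based convention.
one_based <- zero_based + 1L

# One-based indices can subscript the vocabulary vector directly.
words <- vocab[one_based]
print(words)

# When word order is meaningless (e.g. pre-aggregated data), per-word
# counts are still recoverable from the same vector.
counts <- tabulate(one_based, nbins = length(vocab))
names(counts) <- vocab
print(counts)
```

In this sketch the zero-based index 0 maps to the first vocabulary entry after the shift; subscripting with the unshifted indices would silently drop the zeros, which is why the one-based form is the convenient one in R.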

Value

an integer vector, with one-based indices into the vocabulary

See Also

instances_vocabulary, wordcounts_texts, instance_text


agoldst/dfrtopics documentation built on July 15, 2022, 4:13 p.m.