wordcounts_instances: Create MALLET instances from a word-counts data frame
In agoldst/dfrtopics: Tools for exploring topic models of text

wordcounts_instances

R Documentation

Create MALLET instances from a word-counts data frame

Description

Given a data frame representing documents as feature counts, create a MALLET InstanceList object, which can then be passed to train_model or saved to disk with write_instances for later use. This function is a small convenience wrapper around make_instances that ensures no further stopword removal, tokenization, or case folding is done.

Usage

wordcounts_instances(
  counts,
  shuffle = FALSE,
  sep = " ",
  token_regex = "\\S+",
  preserve_case = TRUE
)

Arguments

`counts`	data frame with `id`, `word`, `weight` columns
`shuffle`	randomize word order before passing on to MALLET? (See `wordcounts_texts`.)
`sep`	separator to use between words
`token_regex`	regular expression matching a token. Ordinarily, this should correspond to `sep` (hence the default, whitespace tokenization), since no further tokenization should be done.
`preserve_case`	if FALSE, all words are lowercased by MALLET
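
The expected shape of `counts` may be easier to see with a toy example. The sketch below builds a small data frame with the three required columns and then, using a hypothetical helper `expand_counts` (not part of the package), illustrates the assumed idea of repeating each word `weight` times and joining the results with `sep`. The real call is guarded so that it only runs when dfrtopics, and with it a working Java and MALLET setup, is available.

```r
# Toy word-counts data frame with the three expected columns.
counts <- data.frame(
  id = c("doc1", "doc1", "doc2", "doc2"),
  word = c("whale", "sea", "whale", "ship"),
  weight = c(3L, 1L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical helper, for illustration only. Assumption: each word is
# repeated `weight` times and the repeats are joined with `sep`, giving
# one string per document. This mimics the idea, not the package internals.
expand_counts <- function(counts, sep = " ") {
  pieces <- split(counts, counts$id)
  vapply(
    pieces,
    function(d) paste(rep(d$word, times = d$weight), collapse = sep),
    character(1)
  )
}

texts <- expand_counts(counts)
# texts[["doc1"]] is "whale whale whale sea"
# texts[["doc2"]] is "whale ship ship"

# The actual call, attempted only if the package is installed.
if (requireNamespace("dfrtopics", quietly = TRUE)) {
  instances <- dfrtopics::wordcounts_instances(counts)
}
```

Because the defaults use a single space for `sep` and `"\\S+"` for `token_regex`, each word in the toy data is treated as exactly one token.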

Details

If your tokens themselves contain whitespace, change the sep parameter to a separator that does not occur in any token, and adjust token_regex so that it matches everything between separators.
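
As a minimal sketch of that adjustment, suppose some tokens are bigrams such as "New York". The choice of a tab separator and the pattern `"[^\t]+"` are assumptions made for this illustration. The base R regex check only approximates what MALLET does with its Java regular expressions, but it shows why the default pattern would split such a token.

```r
# Counts in which one "word" contains a space.
counts <- data.frame(
  id = c("doc1", "doc1"),
  word = c("New York", "subway"),
  weight = c(2L, 1L),
  stringsAsFactors = FALSE
)

# Assumed choices for this example: tab as separator, and a token
# defined as one or more non-tab characters.
sep <- "\t"
token_regex <- "[^\t]+"

# Join the repeated words the way a single document's text might look.
text <- paste(rep(counts$word, times = counts$weight), collapse = sep)

# With the adjusted pattern, the original tokens are recovered intact.
tokens <- regmatches(text, gregexpr(token_regex, text, perl = TRUE))[[1]]
# tokens: "New York" "New York" "subway"

# With the default whitespace pattern, the bigram is split in two.
naive <- regmatches(text, gregexpr("\\S+", text, perl = TRUE))[[1]]
# naive: "New" "York" "New" "York" "subway"

# The actual call, attempted only if the package is installed.
if (requireNamespace("dfrtopics", quietly = TRUE)) {
  instances <- dfrtopics::wordcounts_instances(
    counts,
    sep = sep,
    token_regex = token_regex
  )
}
```

Any separator works as long as it never appears inside a token and `token_regex` is its complement.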

Value

an rJava reference to a MALLET InstanceList
