prepare.text: Prepare Text and Calculate Term Frequencies


Description

Perform a number of data preparation steps on a text corpus and calculate term frequencies.

Usage

prepare.text(forest, terms.from = c("subjects", "content"),
             list = c("devel", "help"), protect = NULL, ae.to.be = TRUE,
             replace = TRUE, stem = TRUE)

Arguments

forest

A matrix with five columns. Result of makeforest.

terms.from

Should the function be applied to subjects or content?

list

Should the function be applied to the help or devel mailing list?

ae.to.be

Logical. Should American English spelling be transformed to British English spelling? Defaults to TRUE.

replace

Logical. Should terms be replaced by synonyms found in the text? See also ?wn.replace. Defaults to TRUE.

protect

A numeric vector giving the indices of terms that should not be replaced by synonyms. Defaults to NULL.

stem

Logical. Should terms be stemmed using the tm Snowball stemmer? Defaults to TRUE.
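
The effect of ae.to.be can be pictured with a small base R sketch. This is only an illustration under stated assumptions: the lookup table ae_to_be and the vector terms are hypothetical, and the mapping that snatm applies internally is not documented on this page.

```r
# Hypothetical lookup table (American -> British); not the table used by snatm
ae_to_be <- c(color = "colour", analyze = "analyse", center = "centre")

# Hypothetical terms, as they might appear after tokenisation
terms <- c("color", "plot", "center", "analyse")

# Replace only those terms that have an entry in the lookup table
hit <- terms %in% names(ae_to_be)
terms[hit] <- unname(ae_to_be[terms[hit]])

terms
```

Terms without an entry in the table, such as "plot", and terms already in British spelling are left unchanged.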

Value

$termfreq

A named vector containing the term frequencies of terms found in forest after a number of preparation steps.

$forest

The forest after preparation.
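
To make the shape of $termfreq concrete, the following base R sketch builds a named vector of term frequencies from a few made-up subject lines. It is a minimal illustration, not the implementation used by prepare.text: the vector subjects is hypothetical data, and the spelling, synonym and stemming steps are omitted.

```r
# Hypothetical subject lines standing in for the text held in a forest
subjects <- c("Plotting help needed",
              "Help with plotting colours",
              "Colours in lattice")

# Lower-case the text and split on anything that is not a letter
tokens <- unlist(strsplit(tolower(subjects), "[^a-z]+"))
tokens <- tokens[nzchar(tokens)]

# Count each term and store the counts as a named vector, most frequent first
counts <- table(tokens)
termfreq <- sort(setNames(as.vector(counts), names(counts)), decreasing = TRUE)

termfreq[c("help", "plotting", "colours")]
```

Individual frequencies can then be looked up by term name, for example termfreq["help"].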

Author(s)

Angela Bohn angela.bohn at gmail.com


snatm documentation built on May 2, 2019, 5:01 p.m.
