clean_corpus: Clean corpus

Description

Strips extra whitespace, removes punctuation, converts characters to lowercase, and removes English stopwords, with the option to supply additional stopwords to remove. Used in get_freq_terms.
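The steps listed above can be sketched with standard tm transformations. This is only an illustration under assumptions: the function name clean_corpus_sketch is hypothetical, and the order of the steps is a guess, not the package's actual source.

```r
library(tm)

# Hypothetical re-implementation for illustration only; the real
# clean_corpus may apply these steps differently or in another order.
clean_corpus_sketch <- function(corpus, stopwords = NULL) {
  # Lowercase first so that stopword matching is case-insensitive
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  # English stopwords from tm, plus any user-supplied additions
  corpus <- tm_map(corpus, removeWords, c(tm::stopwords("en"), stopwords))
  # Collapse runs of whitespace left behind by the removals
  corpus <- tm_map(corpus, stripWhitespace)
  corpus
}

corpus <- VCorpus(VectorSource("The  Quick, brown FOX!"))
cleaned <- clean_corpus_sketch(corpus, stopwords = "fox")
txt <- content(cleaned[[1]])
print(txt)
```

Note that stripWhitespace collapses repeated whitespace into a single space but does not trim leading or trailing spaces.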

Usage

clean_corpus(corpus, stopwords = NULL)

Arguments

corpus

The corpus to be cleaned. To create one from a vector vec with the tm package, use: source <- VectorSource(vec); corpus <- VCorpus(source)
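A short sketch of building such a corpus from a character vector with tm. The sample texts are made up, and the variable is named src here (rather than source) only to avoid masking base R's source() function.

```r
library(tm)

# Made-up example documents
vec <- c("First document text.", "Second document, with punctuation!")

src <- VectorSource(vec)   # treat each vector element as one document
corpus <- VCorpus(src)     # in-memory (volatile) corpus

print(length(corpus))      # number of documents in the corpus

# The result can then be passed on, assuming the package is loaded:
# cleaned <- clean_corpus(corpus)
```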

stopwords

Optional. Additional stopwords to remove. If not specified, only the English stopwords from the tm package are removed.
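To see which words are covered by default and how extra ones might be supplied, the following sketch inspects tm's English stopword list. The extra words are made-up examples, and passing them as a character vector is an assumption about what clean_corpus expects.

```r
library(tm)

default_words <- stopwords("en")       # tm's built-in English list
extra_words <- c("data", "analysis")   # made-up, domain-specific additions

# Combined list, as a cleaning step would presumably use it
all_words <- c(default_words, extra_words)

print(head(default_words))
print("the" %in% default_words)

# Assumed call form, with the package loaded:
# cleaned <- clean_corpus(corpus, stopwords = extra_words)
```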

Value

The cleaned corpus.


loshita/oshitar documentation built on May 8, 2019, 11:12 p.m.