frequentwords: Frequent words

frequentwords    R Documentation

Frequent words

Description

Returns the most frequent words of a corpus.

Usage

frequentwords(
  corpus,
  nb,
  mincount = 5,
  minphrasecount = NULL,
  ngram = 1,
  lang = "en",
  stopwords = lang
)

Arguments

corpus

The corpus of documents (a character vector) or the vocabulary of the documents (the result of getvocab).

nb

The number of words to be returned.

mincount

Minimum number of occurrences for a word to be considered frequent.

minphrasecount

Minimum number of occurrences for a collocation (a sequence of words) to be considered frequent.

ngram

Maximum size of the n-grams.

lang

The language of the documents, used for stemming (NULL for no stemming).

stopwords

The stop words to remove, or the language of the documents. NULL if stop words should not be removed.

Value

The most frequent words of the corpus.

See Also

getvocab

Examples

## Not run: 
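# Load a text corpus from a remote archive
text <- loadtext("http://mattmahoney.net/dc/text8.zip")
# The 100 most frequent words, computed from the raw corpus
frequentwords(text, 100)
# Alternatively, build the vocabulary first and pass it instead of the corpus
vocab <- getvocab(text)
frequentwords(vocab, 100)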

## End(Not run)
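
To make the roles of nb, mincount and stopwords more concrete, here is a minimal base R sketch of frequency counting on a toy corpus. It is not the fdm2id implementation: the helper top_words_sketch, its tokenization rule and the small stop word list are assumptions made for illustration only, and it does no stemming or n-gram detection.

```r
# Hypothetical helper, not part of fdm2id: counts words with base R only.
top_words_sketch <- function(docs, nb, mincount = 1, stopwords = character(0)) {
  # Lowercase the documents, then split on anything that is not a letter
  tokens <- unlist(strsplit(tolower(docs), "[^a-z]+"))
  # Drop empty strings and stop words
  tokens <- tokens[nzchar(tokens) & !(tokens %in% stopwords)]
  # Count occurrences and sort from most to least frequent
  counts <- sort(table(tokens), decreasing = TRUE)
  # Keep words seen at least mincount times, then at most nb of them
  counts <- counts[counts >= mincount]
  head(counts, nb)
}

# Toy corpus (illustrative data, not from the package)
docs <- c("the cat sat on the mat", "the cat ate the fish", "a dog saw the cat")
result <- top_words_sketch(docs, nb = 2, mincount = 2,
                           stopwords = c("the", "a", "on"))
print(result)
```

With this toy corpus, only "cat" (3 occurrences) passes the mincount = 2 threshold once the stop words are removed, so fewer than nb words are returned.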

fdm2id documentation built on July 9, 2023, 6:05 p.m.
