lexicon: Lexicon

View source: R/corpus.R

Description

The lexicon of a corpus consists of all the terms that occur in any document in the corpus. The lexical frequency of a term is how often that term occurs across all of the documents. Often the most interesting words in a document are those whose frequency within the document is higher than their frequency in the corpus as a whole.
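
To make the comparison concrete, here is a minimal base R sketch of the idea. It is not how textanalysis computes its lexicon: the helpers `tokenize` and `term_freq` are hypothetical names introduced only for this illustration, and the tokenization (lowercasing, stripping punctuation, splitting on whitespace) is an assumption.

```r
# Illustrative sketch in base R; not the textanalysis implementation.
docs <- c("First document.", "Second document.")

# Hypothetical helper: lowercase, strip punctuation, split on whitespace.
tokenize <- function(text) {
  cleaned <- gsub("[[:punct:]]", "", tolower(text))
  tokens <- strsplit(cleaned, "[[:space:]]+")[[1]]
  tokens[nzchar(tokens)]
}

# Hypothetical helper: relative frequency of each distinct term.
term_freq <- function(tokens) {
  counts <- table(tokens)
  setNames(as.vector(counts) / length(tokens), names(counts))
}

tokens_by_doc <- lapply(docs, tokenize)

# Lexicon of the toy corpus: every distinct term with its corpus-wide frequency.
corpus_freq <- term_freq(unlist(tokens_by_doc))

# Frequency of each term within the first document only.
doc1_freq <- term_freq(tokens_by_doc[[1]])

# A ratio above 1 marks a term that is more frequent in document 1
# than in the corpus as a whole.
ratio <- doc1_freq / corpus_freq[names(doc1_freq)]
print(ratio)
```

In this toy corpus "document" appears in both texts, so its ratio is 1, while "first" appears only in the first text and gets a ratio of 2, which marks it as the more distinctive term for that document.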

Usage

lexicon(corpus)

update_lexicon(corpus)

## S3 method for class 'corpus'
lexicon(corpus)

## S3 method for class 'corpus'
update_lexicon(corpus)

Arguments

corpus

A corpus, as returned by corpus.

Examples

## Not run: 
init_textanalysis()

# build documents
doc1 <- string_document("First document.")
doc2 <- string_document("Second document.")

# build the corpus without updating the lexicon automatically
corpus <- corpus(doc1, doc2, update_lexicon = FALSE)

# update the lexicon explicitly, then retrieve it
update_lexicon(corpus)
lexicon(corpus)

## End(Not run)

news-r/textanalysis documentation built on Nov. 4, 2019, 9:40 p.m.