subset_corpus: subset_corpus

View source: R/corpus.R

Description

Select documents containing (or not containing) one or more terms.

Usage

subset_corpus(corpus, dtm, terms, exclude = FALSE, all = FALSE)

Arguments

corpus

A Corpus object.

dtm

A DocumentTermMatrix object corresponding to corpus.

terms

One or more terms appearing in dtm.

exclude

Whether documents containing the terms should be excluded rather than retained.

all

Whether a document must contain all of the terms, rather than at least one of them, to be retained (or excluded, if exclude is TRUE). By default, a document only needs to contain at least one of the terms.
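
The way exclude and all combine can be sketched on a toy matrix. This is only an illustration of the behaviour described above, not the package's implementation: a plain base R matrix stands in for the DocumentTermMatrix, and select_docs() is a hypothetical helper that does not exist in R.temis.

```r
# Toy document-term matrix: rows are documents, columns are term counts.
# Assumption: a base R matrix is used in place of a DocumentTermMatrix.
toy_dtm <- matrix(c(2, 0, 1,
                    0, 3, 0,
                    1, 1, 0,
                    0, 0, 4),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("doc", 1:4),
                                  c("barrel", "opec", "price")))

# Hypothetical helper (not part of R.temis): returns the names of the
# documents that would be selected for the given arguments.
select_docs <- function(m, terms, exclude = FALSE, all = FALSE) {
  # TRUE where a document contains a term at least once
  present <- m[, terms, drop = FALSE] > 0
  # all = TRUE: every term must be present; otherwise one is enough
  hit <- if (all) rowSums(present) == length(terms) else rowSums(present) > 0
  # exclude = TRUE: keep the documents that did NOT match
  keep <- if (exclude) !hit else hit
  rownames(m)[keep]
}

select_docs(toy_dtm, "barrel")
select_docs(toy_dtm, c("barrel", "opec"))
select_docs(toy_dtm, c("barrel", "opec"), exclude = TRUE)
select_docs(toy_dtm, c("barrel", "opec"), all = TRUE)
select_docs(toy_dtm, c("barrel", "opec"), exclude = TRUE, all = TRUE)
```

Under this reading, combining exclude = TRUE with all = TRUE drops only the documents that contain every term, so a document containing just one of the terms is still kept.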

Value

A Corpus object containing only the selected documents.

Examples

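library(R.temis)
# Import the sample Factiva file shipped with tm.plugin.factiva
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
corpus <- import_corpus(file, "factiva", language="en")
dtm <- build_dtm(corpus)
# Keep documents containing "barrel"
subset_corpus(corpus, dtm, "barrel")
# Keep documents containing at least one of "barrel" and "opec"
subset_corpus(corpus, dtm, c("barrel", "opec"))
# Drop documents containing at least one of "barrel" and "opec"
subset_corpus(corpus, dtm, c("barrel", "opec"), exclude=TRUE)
# Keep only documents containing both "barrel" and "opec"
subset_corpus(corpus, dtm, c("barrel", "opec"), all=TRUE)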

R.temis documentation built on May 13, 2021, 1:08 a.m.