cooc_terms: cooc_terms

View source: R/terms.R

Description

Show terms that are the most associated (positively or negatively) with a reference term.

Usage

cooc_terms(
  dtm,
  term,
  variable = NULL,
  p = 0.1,
  n = 25,
  sparsity = 1,
  min_occ = 2
)

Arguments

dtm

A DocumentTermMatrix.

term

A reference term appearing in dtm.

variable

An optional vector of values giving the groups for which co-occurrent terms should be reported.

p

The maximum p-value up to which terms should be reported.

n

The maximal number of terms to report (for each group, if applicable).

sparsity

Value between 0 and 1 indicating the proportion of documents with no occurrences of a term above which that term should be dropped. By default all terms are kept (sparsity=1).

min_occ

The minimum number of occurrences in the whole dtm below which terms should be skipped.
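
The sparsity and min_occ thresholds described above can be illustrated on a toy count matrix in base R. This is only a sketch of the filtering rule as the argument descriptions word it: the matrix, the helper keep_terms and the exact handling of boundary values are assumptions made for illustration, not the package's code.

```r
# Toy document-term counts: 4 documents (rows) x 3 terms (columns).
counts <- matrix(
  c(1, 0, 0,
    2, 0, 2,
    0, 0, 0,
    1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("oil", "rare", "price"))
)

# Proportion of documents in which each term does not occur.
empty_share <- colMeans(counts == 0)

# Total number of occurrences of each term over all documents.
total_occ <- colSums(counts)

# Hypothetical helper (not part of R.temis): keep the terms whose share of
# documents without the term is at most `sparsity` and whose total count
# is at least `min_occ`.
keep_terms <- function(sparsity = 1, min_occ = 2) {
  names(which(empty_share <= sparsity & total_occ >= min_occ))
}

keep_terms()                # with sparsity = 1, only min_occ filters terms
keep_terms(sparsity = 0.5)  # also drops terms absent from over half the documents
```

With the default sparsity = 1 no term can exceed the threshold, so only min_occ has an effect; lowering sparsity makes the filter stricter.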

Details

Co-occurrent terms are those that are specific to the documents containing the given term. The output is the same as that returned by specific_terms.
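
The idea can be sketched in base R: split the documents according to whether they contain the reference term, then check which other terms are over-represented in that subset. The toy matrix is invented, and the hypergeometric test is an assumption chosen for illustration; this page does not state which test the package applies.

```r
# Toy document-term counts: 5 documents x 3 terms.
counts <- matrix(
  c(1, 3, 0,
    2, 2, 1,
    0, 0, 4,
    0, 1, 3,
    1, 2, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("doc", 1:5), c("barrel", "oil", "bank"))
)

# Documents that contain the reference term at least once.
has_ref <- counts[, "barrel"] > 0

# Occurrences of every other term inside those documents and overall.
others   <- setdiff(colnames(counts), "barrel")
in_group <- colSums(counts[has_ref, others, drop = FALSE])
overall  <- colSums(counts[, others, drop = FALSE])

# Total occurrences (all terms) inside the group and in the whole matrix.
group_size <- sum(counts[has_ref, ])
total_size <- sum(counts)

# Assumption: hypergeometric upper tail as the over-representation test,
# i.e. P(X >= in_group) when drawing group_size occurrences out of
# total_size, of which `overall` belong to the term.
p_over <- phyper(in_group - 1, overall, total_size - overall, group_size,
                 lower.tail = FALSE)

round(cbind(in_group, overall, p_over), 3)
```

A small p_over value marks a term as positively associated with the reference term. Under the same assumption, negative associations would be read from the lower tail, phyper(in_group, ...) with lower.tail = TRUE.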

Value

A list of matrices, one for each level of the variable, with the same columns as those returned by specific_terms.

Examples

file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
corpus <- import_corpus(file, "factiva", language="en")
dtm <- build_dtm(corpus)
cooc_terms(dtm, "barrel")
cooc_terms(dtm, "barrel", meta(corpus)$Date)

R.temis documentation built on May 13, 2021, 1:08 a.m.
