DTM.aggr_synonyms: Aggregate redundant terms

View source: R/NLP.R

DTM.aggr_synonyms    R Documentation

Aggregate redundant terms

Description

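Builds a term-term network in which two terms are linked according to their cosine similarity, computed from how the terms co-occur across documents. Only term pairs whose similarity meets the threshold set in min.sim become edges. Because the threshold is high, the network splits into multiple connected components, and each component containing more than one term identifies a group of redundant terms.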
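The procedure described above can be sketched in base R. This is only an illustration of the idea, not the package's actual code: the toy matrix, the helper find_components, and the variable names are all hypothetical, and the use of >= for the threshold is an assumption.

```r
# Toy binary Document Term Matrix: rows are documents, columns are terms.
# (Hypothetical data, for illustration only.)
dtm <- matrix(
  c(1, 1, 0, 1,
    1, 1, 0, 0,
    0, 0, 1, 1,
    1, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("tumor", "tumour", "liver", "kidney"))
)

# Cosine similarity between term columns.
cross <- crossprod(dtm)             # term-by-term dot products
norms <- sqrt(diag(cross))          # Euclidean norm of each term column
sim <- cross / outer(norms, norms)

# Keep only edges at or above the threshold (assumed >=); drop self-loops.
min_sim <- 0.9
adj <- sim >= min_sim
diag(adj) <- FALSE

# Hypothetical helper: label connected components with a breadth-first search.
find_components <- function(adj) {
  n <- nrow(adj)
  membership <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(membership[start])) next
    current <- current + 1L
    membership[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      # Unvisited neighbours of the current node join the same component.
      neighbours <- which(adj[node, ] & is.na(membership))
      membership[neighbours] <- current
      queue <- c(queue, neighbours)
    }
  }
  stats::setNames(membership, colnames(adj))
}

membership <- find_components(adj)

# Components with more than one term are the candidate redundant groups.
redundant_groups <- Filter(
  function(g) length(g) > 1,
  split(names(membership), membership)
)
```

In this toy matrix, "tumor" and "tumour" appear in exactly the same documents, so they are the only pair that passes the threshold and end up in the same component.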

Usage

DTM.aggr_synonyms(DTM, min.sim = 0.9)

Arguments

DTM

A Document Term Matrix.

min.sim

The minimum cosine similarity required for two terms to be connected by an edge. Defaults to 0.9.

Value

The input Document Term Matrix, with each group of redundant terms removed and replaced by a new column that joins them.
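As a rough picture of what such a return value could look like, the sketch below merges one group of redundant columns into a single new column. The helper aggregate_terms, the "." separator in the new column name, and the rule that a joined term counts as present when any of its members is present are all assumptions made for illustration; the package may name and combine the columns differently.

```r
# Toy binary Document Term Matrix (hypothetical data).
dtm <- matrix(
  c(1, 0, 1,
    0, 1, 0,
    0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), c("tumor", "tumour", "liver"))
)

# Groups of redundant terms, e.g. obtained from the network components.
groups <- list(c("tumor", "tumour"))

# Hypothetical helper: replace each group with one joined column.
aggregate_terms <- function(dtm, groups, sep = ".") {
  for (g in groups) {
    # Assumed rule: the joined term is present if any member term is present.
    merged <- as.numeric(rowSums(dtm[, g, drop = FALSE]) > 0)
    # Remove the original redundant columns, then append the joined one.
    dtm <- dtm[, !(colnames(dtm) %in% g), drop = FALSE]
    dtm <- cbind(dtm, merged)
    colnames(dtm)[ncol(dtm)] <- paste(g, collapse = sep)
  }
  dtm
}

result <- aggregate_terms(dtm, groups)
```

Here result keeps the "liver" column untouched and gains a "tumor.tumour" column in place of the two original ones.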


bakaburg1/BaySREn documentation built on March 30, 2022, 12:16 a.m.