textmatch: Toolkit for Matching Textual Data and Evaluating Textual Similarity

transform_dfm

R Documentation

Applies bounds, weights, and/or coarsening schemes to a dfm or document frequency matrix to reduce the dimension of the data, reduce noise, or apply other design rules (e.g., to exclude words that occur in too few or too many documents).

Description

Applies bounds, weights, and/or coarsening schemes to a dfm or document frequency matrix to reduce the dimension of the data, reduce noise, or apply other design rules (e.g., to exclude words that occur in too few or too many documents).

Usage

transform_dfm(x, bounds, tfidf = FALSE, verbose = TRUE)

Arguments

`x`	a matrix representation of the text, with one row per document in a corpus and columns that represent summary measures of the text (e.g., word counts or topic proportions). Acceptable forms include a valid quanteda `dfm` object, a tm Document-Term Matrix, or a matrix of estimated topic proportions.
`bounds`	a vector of lower and upper bounds to enforce. Defaults to excluding any terms that appear in only one document and any terms that appear in every document.
`tfidf`	optional scheme to use for weighting the DTM. Defaults to `FALSE`.
`verbose`	logical indicator for verbosity. Defaults to `TRUE`.
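
To make the effect of `bounds` concrete, the following is a minimal base R sketch of document-frequency filtering. It is not the package's implementation: the helper name `apply_doc_freq_bounds` is hypothetical, and it assumes the bounds are inclusive counts of documents, which this page does not confirm. Its default mirrors the default described above (drop terms found in only one document or in every document).

```r
# Hypothetical helper, not part of textmatch: keep only the columns (terms)
# whose document frequency lies within [lower, upper], counted in documents.
# Assumption: bounds are inclusive document counts.
apply_doc_freq_bounds <- function(x, bounds = c(2, nrow(x) - 1)) {
  stopifnot(is.matrix(x), length(bounds) == 2)
  doc_freq <- colSums(x > 0)  # number of documents containing each term
  keep <- doc_freq >= bounds[1] & doc_freq <= bounds[2]
  x[, keep, drop = FALSE]
}

# Toy document-term matrix: 4 documents, 4 terms.
dtm <- matrix(
  c(1, 0, 2, 1,
    1, 0, 0, 3,
    1, 1, 1, 0,
    1, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("the", "rare", "policy", "tax"))
)

# "the" appears in every document and "rare" in only one, so both are dropped.
bounded <- apply_doc_freq_bounds(dtm)
colnames(bounded)
```

In this toy example only `policy` and `tax` survive, since each appears in exactly two of the four documents.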

Value

A bounded DFM
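
The `tfidf` argument refers to term frequency-inverse document frequency weighting. This page does not say which variant the package applies, so the sketch below is only an illustration of the general idea in base R, assuming raw counts as the term frequency and `log(N / df)` as the inverse document frequency. The helper name `tfidf_weight` is hypothetical.

```r
# Hypothetical helper, not part of textmatch: weight raw counts by
# idf = log(N / df), where N is the number of documents and df is the
# number of documents containing the term.
# Assumption: every term occurs in at least one document; a term with
# df = 0 would produce non-finite weights.
tfidf_weight <- function(x) {
  stopifnot(is.matrix(x), all(colSums(x > 0) > 0))
  doc_freq <- colSums(x > 0)
  idf <- log(nrow(x) / doc_freq)
  sweep(x, 2, idf, `*`)  # multiply each column by its idf
}

# Toy counts: 2 documents, 3 terms.
counts <- matrix(
  c(2, 0, 1,
    1, 3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("a", "b", "c"))
)

weighted <- tfidf_weight(counts)
weighted
```

Under this variant, a term that appears in every document (here `a`) receives a weight of zero, which is one reason weighting and bounding are often applied together.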

reaganmozer/textmatch documentation built on March 7, 2024, 2:41 p.m.
