remove_infrequent_terms: Remove infrequently occurring terms from quanteda dfm.
In matthewjdenny/preptest: Diagnostics to Assess the Effects of Text Preprocessing Decisions


View source: R/remove_infrequent_terms.R

Removes from a dfm any term that appears in fewer than a specified proportion of the documents in a corpus.

remove_infrequent_terms(
  dfm_object,
  proportion_threshold = 0.01,
  indices = NULL,
  verbose = TRUE
)

`dfm_object`	A quanteda dfm object.
`proportion_threshold`	The minimum proportion of documents in which a term must appear to be retained in the dfm. Defaults to 0.01.
`indices`	Defaults to NULL. If not NULL, it must be a numeric vector giving the column indices of the terms the user would like to remove. Useful for removing specific terms.
`verbose`	Logical indicating whether more information should be printed to the screen to let the user know about progress in preprocessing. Defaults to TRUE.
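The `indices` argument describes removal of terms by column position. As a rough illustration only, the sketch below does this on a plain base R count matrix standing in for a dfm. The helper name `drop_columns` and the toy data are hypothetical and are not part of the package; how `remove_infrequent_terms` handles `indices` internally is not shown in this documentation.

```r
# Toy document-term count matrix standing in for a dfm
# (assumption: 3 documents, 4 terms; not package data).
counts <- matrix(
  c(2, 0, 1, 0,
    1, 1, 0, 0,
    3, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), c("tax", "nhs", "rail", "farm"))
)

# Hypothetical helper: drop the columns listed in `indices`.
drop_columns <- function(mat, indices = NULL) {
  # Guard against NULL or empty input: mat[, -integer(0)] selects no
  # columns at all, which would silently drop every term.
  if (is.null(indices) || length(indices) == 0) {
    return(mat)
  }
  mat[, -indices, drop = FALSE]
}

# Remove the second and fourth terms by position.
reduced <- drop_columns(counts, indices = c(2, 4))
colnames(reduced)
```

The `drop = FALSE` argument keeps the result a matrix even when only one column survives.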

A reduced dfm with the infrequent terms removed.

## Not run: 
# load the package
library(preText)
# load in the data
data("UK_Manifestos")
# preprocess data
preprocessed_documents <- factorial_preprocessing(
    UK_Manifestos,
    use_ngrams = TRUE,
    infrequent_term_threshold = 0.02,
    verbose = TRUE)
# remove terms appearing in fewer than half of the documents
# from the first dfm in the list
updated_dfm <- remove_infrequent_terms(preprocessed_documents$dfm_list[[1]],
                                       proportion_threshold = 0.5,
                                       indices = NULL,
                                       verbose = TRUE)

## End(Not run)
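To make the thresholding rule in the description concrete, here is a minimal sketch on a base R count matrix instead of a quanteda dfm. It assumes that a term whose document proportion equals the threshold exactly is kept, since the description says only terms appearing in fewer documents than the threshold are removed. The function name `drop_rare_terms` and the toy data are hypothetical; this is not the package's implementation.

```r
# Toy document-term count matrix standing in for a dfm
# (assumption: 4 documents, 4 terms; not package data).
counts <- matrix(
  c(2, 0, 1, 0,
    1, 1, 0, 0,
    3, 0, 0, 0,
    1, 2, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("tax", "nhs", "rail", "farm"))
)

# Hypothetical helper: keep terms that occur in at least
# `proportion_threshold` of the documents.
drop_rare_terms <- function(mat, proportion_threshold = 0.01) {
  # Proportion of documents in which each term occurs at least once.
  doc_prop <- colSums(mat > 0) / nrow(mat)
  # Assumption: a term exactly at the threshold is retained.
  mat[, doc_prop >= proportion_threshold, drop = FALSE]
}

reduced <- drop_rare_terms(counts, proportion_threshold = 0.5)
colnames(reduced)
```

In this toy matrix, "tax" occurs in all four documents and "nhs" in two of four, so both meet the 0.5 threshold, while "rail" (one of four) and "farm" (none) are dropped.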

matthewjdenny/preptest documentation built on July 27, 2021, 1:19 a.m.
