get_bottom_terms: Get List of Least Frequent Terms
In avkoehl/textprocessingDSI: Clean an arbitrarily large corpus for topic modelling over many cores


Similar to get_sparse, but looks at word frequency rather than document count. If X is a whole number, returns the X least frequent terms. If X is a decimal, returns the bottom X fraction of terms (for example, .05 returns the bottom 5%).

get_bottom_terms(wf, nterms, count)

`wf`	A data table containing the word and document frequencies across the corpus.
`nterms`	A number specifying the total number of unique words in the corpus.
`count`	A number, either decimal or whole; a decimal is interpreted as a percentage, a whole number as a count.

words A character vector of the least frequent terms

## Not run: 
infreq = get_bottom_terms(wf, 100000, 5000) #returns 5000 least common terms
infreq = get_bottom_terms(wf, 100000, .05) #returns the bottom 5% of terms

## End(Not run)
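
The two interpretations of `count` can be illustrated with a small self-contained sketch. This is not the package's implementation: `bottom_terms_sketch` is a hypothetical stand-in, and the column names `word` and `freq` are assumptions, since the layout of `wf` is not documented here. It also assumes that any value below 1 is treated as a fraction of `nterms`.

```r
# Hypothetical stand-in for get_bottom_terms; not the package's actual code.
# Assumes wf has a character column `word` and a numeric column `freq`.
bottom_terms_sketch <- function(wf, nterms, count) {
  # Assumption: a value below 1 is a fraction of nterms, otherwise a term count.
  n <- if (count < 1) floor(nterms * count) else count
  n <- min(n, nrow(wf))
  # Sort rows by ascending word frequency and keep the first n words.
  ordered <- wf[order(wf$freq), ]
  head(ordered$word, n)
}

# Toy word-frequency table with five unique terms.
wf <- data.frame(
  word = c("the", "corpus", "zymurgy", "topic", "quixotic"),
  freq = c(120, 15, 1, 30, 2),
  stringsAsFactors = FALSE
)

by_count <- bottom_terms_sketch(wf, nterms = nrow(wf), count = 2)   # whole number: 2 terms
by_share <- bottom_terms_sketch(wf, nterms = nrow(wf), count = 0.4) # decimal: bottom 40%
```

With five unique terms, a count of 2 and a share of 0.4 select the same two rarest words, which is a quick way to check that both modes agree on a small table.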

avkoehl/textprocessingDSI documentation built on June 5, 2019, 7:41 p.m.
