View source: R/dtmremovetfidf.R
dtmremovetfidf | R Documentation |
Remove terms from a Document-Term-Matrix, along with documents left with no terms, based on the term frequency inverse document frequency (tfidf).
Terms are selected by giving either the maximum number of terms to keep (argument top), a tfidf cutoff (argument cutoff), or a quantile (argument prob).
dtmremovetfidf(dtm, top, cutoff, prob, remove_emptydocs = TRUE)
dtm |
an object of class "dgCMatrix" |
top |
integer with the number of terms which should be kept as defined by the highest mean tfidf |
cutoff |
numeric cutoff value to keep only terms in dtm whose tfidf is above this value |
prob |
numeric quantile indicating to keep only terms in dtm whose tfidf is above this quantile of the tfidf values |
remove_emptydocs |
logical indicating whether to remove documents that contain no terms after the term removal is executed. Defaults to TRUE. |
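The three selection arguments can be pictured as three ways of thresholding one vector of per-term scores. The sketch below uses base R only and a made-up named vector `scores` (hypothetical values, not output of this function). Whether the function itself uses a strict or non-strict comparison is an assumption here.

```r
# Hypothetical per-term tfidf scores (made-up numbers for illustration only)
scores <- c(apple = 0.40, banana = 0.10, cherry = 0.25, date = 0.05)

# top: keep the n terms with the highest score
keep_top <- names(sort(scores, decreasing = TRUE))[seq_len(2)]

# cutoff: keep terms whose score is above a fixed value (strict '>' is assumed)
keep_cutoff <- names(scores)[scores > 0.2]

# prob: keep terms whose score is above the given quantile of all scores
keep_prob <- names(scores)[scores > quantile(scores, probs = 0.5)]
```

With these toy numbers all three criteria happen to select the same two terms. On real data they generally differ, since top fixes the vocabulary size while cutoff and prob fix a threshold.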
a sparse Matrix as returned by sparseMatrix, in which terms with high tfidf are kept and, if remove_emptydocs is TRUE, documents without any remaining terms are removed
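As a rough illustration of the behaviour described above, the following sketch prunes a toy "dgCMatrix" by mean tfidf and then drops documents that end up empty. It relies only on the Matrix package and is not this function's implementation: the tf-idf weighting (term count divided by document length, times the natural log of the inverse document frequency) and the toy counts are assumptions made for the example.

```r
library(Matrix)

# Toy document-term matrix: 3 documents x 4 terms (counts are made up)
dtm <- sparseMatrix(
  i = c(1, 1, 2, 2, 3),
  j = c(1, 2, 1, 3, 4),
  x = c(2, 1, 1, 3, 1),
  dims = c(3, 4),
  dimnames = list(c("doc1", "doc2", "doc3"), c("a", "b", "c", "d"))
)

# Assumed weighting: tf = count / document length, idf = log(n_docs / doc_freq)
tf    <- Diagonal(x = 1 / unname(rowSums(dtm))) %*% dtm
idf   <- log(nrow(dtm) / unname(colSums(dtm > 0)))
tfidf <- tf %*% Diagonal(x = idf)

# Mean tfidf per term, named by term
score <- colMeans(tfidf)
names(score) <- colnames(dtm)

# Keep the 2 terms with the highest mean tfidf (the role of the 'top' argument)
top <- 2
keep_terms <- names(sort(score, decreasing = TRUE))[seq_len(top)]
pruned <- dtm[, colnames(dtm) %in% keep_terms, drop = FALSE]

# Drop documents left without any terms (the role of remove_emptydocs = TRUE)
pruned <- pruned[rowSums(pruned) > 0, , drop = FALSE]
```

In this toy case terms c and d score highest, so doc1, which contains only a and b, has no remaining terms and is removed from the result.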