flag.exact: Flag the documents that exactly match a pre-specified list of...
In kshirley/LDAtools: Tools to fit a topic model using Latent Dirichlet Allocation (LDA)

Description

If there are certain (typically very short) documents that occur frequently in your data, and you wish to remove them from the data before you fit the LDA model, this function can be used to flag those documents. It's a trivial operation, but it's a useful reminder that users should visually inspect their data before running LDA (so as to throw out documents that don't require topic modeling in the first place).

Usage

flag.exact(data, exact, verbose = FALSE, quiet = FALSE)

Arguments

`data`	a character vector containing the raw corpus. Each element should correspond to a 'document'.
`exact`	a character vector in which each element is a string, phrase, or longer snippet of text. A document is flagged for removal only if its entire content matches one of these elements.
`verbose`	logical. If TRUE, record which element of `exact` each document matched. For instance, if a document exactly matches the third element of `exact`, then the corresponding value returned will be 3. If FALSE, only a 0/1 indicator of a match is returned.
`quiet`	logical. If FALSE (the default), a summary of the preprocessing steps is printed to the screen; set to TRUE to suppress it.

Value

category	an integer vector of the same length as `data`. If verbose=TRUE, 0 indicates that the document did not match any of the strings in `exact`, and an integer j = 1, ..., K (where K is the length of `exact`) indicates that the document was an exact match to the jth element of `exact`. If verbose=FALSE, it is an indicator vector: 1 if the document exactly matched any element of `exact` (without indicating which one), and 0 otherwise.

See Also

flag.partial

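Examples

data <- c("bla bla bla", "foo", "bar", "text")
match.exact <- c("foo", "junk")
# Indicator only: 1 marks a document that matches any element of match.exact
flag.exact(data, match.exact, verbose=FALSE, quiet=FALSE) # c(0, 1, 0, 0)
# Index of the matched element: "foo" is the first element of match.exact
flag.exact(data, match.exact, verbose=TRUE, quiet=FALSE) # c(0, 1, 0, 0)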
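
The matching rule described under Value can be reproduced in base R, which may help when checking results by hand. The following is a minimal sketch, not the LDAtools source: `flag_exact_sketch`, `docs`, and `boilerplate` are hypothetical names introduced here, and the sketch omits the printed summary controlled by `quiet`.

```r
# Hypothetical stand-in for flag.exact(); NOT the LDAtools implementation.
flag_exact_sketch <- function(data, exact, verbose = FALSE) {
  # match() returns, for each document, the position in `exact` of an
  # exact whole-string match, or 0 (via nomatch) when there is none.
  idx <- match(data, exact, nomatch = 0L)
  # verbose = TRUE keeps the matched position; otherwise collapse to 0/1.
  if (verbose) idx else as.integer(idx > 0L)
}

docs <- c("thanks", "n/a", "the model converged slowly", "n/a")
boilerplate <- c("n/a", "thanks")

flag_exact_sketch(docs, boilerplate, verbose = TRUE)  # 2 1 0 1
flag_exact_sketch(docs, boilerplate)                  # 1 1 0 1

# Keep only the documents that were not flagged.
kept <- docs[flag_exact_sketch(docs, boilerplate) == 0L]
```

Subsetting on a flag of 0, as in the last line, is one way to drop the flagged documents before fitting the model.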

kshirley/LDAtools documentation built on May 20, 2019, 7:03 p.m.
