Converts plural tokens in the columns of a document term matrix to their singular form, then aggregates all columns that now share the same token. See the example below.
DepluralizeDtm(dtm, ...)
dtm | A document term matrix of class
... | Other arguments to pass to
Returns a document term matrix of class dgCMatrix. The columns index
the de-pluralized tokens of the input document term matrix. Because
plural and singular forms are merged into one column, the returned matrix
will generally have fewer columns than the input matrix.
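The column aggregation described above can be illustrated with a minimal sketch. It is not the package's implementation: the plural-to-singular lookup `singular` is a hand-written, hypothetical mapping, the toy matrix is made up for illustration, and the merge is done with a sparse indicator matrix from the Matrix package.

```r
library(Matrix)

# Toy document term matrix (made-up counts): 2 documents, 4 tokens
dtm <- Matrix(c(1, 0, 2, 1,
                0, 3, 1, 0),
              nrow = 2, ncol = 4, byrow = TRUE, sparse = TRUE,
              dimnames = list(c("doc_1", "doc_2"),
                              c("fox", "foxes", "horse", "horses")))

# Hypothetical lookup from each token to its singular form
# (an assumption for this sketch, not how DepluralizeDtm finds singulars)
singular <- c(fox = "fox", foxes = "fox", horse = "horse", horses = "horse")
new_names <- unname(singular[colnames(dtm)])

# Indicator matrix: one row per old column, one column per singular token
groups <- unique(new_names)
map <- sparseMatrix(i = seq_along(new_names),
                    j = match(new_names, groups),
                    x = 1,
                    dims = c(length(new_names), length(groups)),
                    dimnames = list(colnames(dtm), groups))

# Multiplying sums the counts of all columns that share a singular token
dtm_new <- dtm %*% map
```

Here `dtm_new` has two columns (`fox`, `horse`) instead of four, and each entry is the sum of the counts that previously sat in the singular and plural columns.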
This function performs parallel computation by default. The default
behavior is to use all available cores according to detectCores.
However, this can be modified by passing the cpus argument when calling
this function.
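As a rough sketch of how a caller might choose a value for the cpus argument, the snippet below queries the core count with `parallel::detectCores()` and leaves one core free. The "leave one core free" policy and the variable names are assumptions for illustration; the commented-out call only shows where the value would be passed.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Hypothetical policy: leave one core free, but never go below 1
cpus <- max(1L, as.integer(n_cores) - 1L)

# The value would then be passed through to the function, e.g.:
# dtm_new <- DepluralizeDtm(dtm = dtm, cpus = cpus)
```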
## Not run:
# Four short documents; some tokens appear in both singular and plural form
myvec <- c("the quick brown fox eats chickens",
           "the slow gray fox eats the slow chicken",
           "look at my horse", "my horses are amazing")
names(myvec) <- paste("doc", seq_along(myvec), sep = "_")

# Build a unigram document term matrix
dtm <- Vec2Dtm(vec = myvec, min.n.gram = 1, max.n.gram = 1)

# Merge plural columns (e.g. "horses") into their singular counterparts
dtm_new <- DepluralizeDtm(dtm = dtm)

## End(Not run)