TermDocumentMatrix | R Documentation |
Constructs or coerces to a term-document matrix or a document-term matrix.
TermDocumentMatrix(x, control = list())
DocumentTermMatrix(x, control = list())
as.TermDocumentMatrix(x, ...)
as.DocumentTermMatrix(x, ...)
x |
for the constructors, a corpus or an R object from which a
corpus can be generated via |
control |
a named list of control options. There are local
options which are evaluated for each document and global options
which are evaluated once for the constructed matrix. Available local
options are documented in This is different for a Available global options are:
|
... |
the additional argument |
An object of class TermDocumentMatrix
or class
DocumentTermMatrix
(both inheriting from a
simple triplet matrix in package slam)
containing a sparse term-document matrix or document-term matrix. The
attribute weighting
contains the weighting applied to the
matrix.
termFreq
for available local control options.
data("crude")
tdm <- TermDocumentMatrix(crude,
control = list(removePunctuation = TRUE,
stopwords = TRUE))
dtm <- DocumentTermMatrix(crude,
control = list(weighting =
function(x)
weightTfIdf(x, normalize =
FALSE),
stopwords = TRUE))
inspect(tdm[202:205, 1:5])
inspect(tdm[c("price", "prices", "texas"), c("127", "144", "191", "194")])
inspect(dtm[1:5, 273:276])
if(requireNamespace("SnowballC")) {
s <- SimpleCorpus(VectorSource(unlist(lapply(crude, as.character))))
m <- TermDocumentMatrix(s,
control = list(removeNumbers = TRUE,
stopwords = TRUE,
stemming = TRUE))
inspect(m[c("price", "texa"), c("127", "144", "191", "194")])
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.