A function to create a document-term matrix from a corpus. It is a convenience wrapper around tidytext::cast_dtm. The returned data frame is intended to be suitable as input to machine learning tasks.
create_dtm(corpus, filterwords, stop = TRUE, doc_title = "title")
corpus | A data frame containing columns for title and text
filterwords | A data frame containing words to filter on
stop | A boolean denoting whether to use filterwords as stop words
doc_title | The name of the column containing the document title
A data frame of a document term matrix.
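The source of create_dtm is not shown on this page, so the following is only a minimal sketch of how such a wrapper around tidytext::cast_dtm might work. The name sketch_dtm, the NULL default for filterwords, the behaviour when stop = FALSE, and the assumption that filterwords has a column named word (as tidytext's stop_words does) are all hypothetical. It requires the dplyr, tidytext and tm packages.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for create_dtm(); not the package's actual code.
sketch_dtm <- function(corpus, filterwords = NULL, stop = TRUE,
                       doc_title = "title") {
  # Give the title column a fixed name so the later steps can refer to it.
  names(corpus)[names(corpus) == doc_title] <- "document"

  # One row per (document, word) pair; unnest_tokens() lowercases by default.
  tokens <- unnest_tokens(corpus, word, text)

  if (!is.null(filterwords)) {
    if (stop) {
      # Treat filterwords as stop words: drop every matching token.
      tokens <- anti_join(tokens, filterwords, by = "word")
    } else {
      # Assumption: with stop = FALSE, keep only the listed words.
      tokens <- semi_join(tokens, filterwords, by = "word")
    }
  }

  # Count each word per document, then cast to a DocumentTermMatrix.
  counts <- count(tokens, document, word)
  dtm <- cast_dtm(counts, document, word, n)

  # Dense data frame: one row per document, one column per term.
  as.data.frame(as.matrix(dtm))
}

books <- data.frame(title = c("Book A", "Book B", "Book C"),
                    text = c("Once upon a time", "A long time ago",
                             "In a land far away"),
                    stringsAsFactors = FALSE)
custom_stops <- data.frame(word = c("a", "in"), stringsAsFactors = FALSE)
sketch <- sketch_dtm(books, filterwords = custom_stops)
```

In this sketch a document whose tokens are all filtered out disappears from the result, because cast_dtm only sees documents that still have at least one counted word.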
## Not run:
library(tidytext)
# A small corpus with one row per document
books <- data.frame(title = c("Book A", "Book B", "Book C"),
                    text = c("Once upon a time", "A long time ago",
                             "In a land far away"),
                    stringsAsFactors = FALSE)
# No filterwords supplied
book_dtm <- create_dtm(books)
# Use tidytext's stop_words data frame as the filter
book_dtm_1 <- create_dtm(books, filterwords = stop_words)
## End(Not run)