View source: R/create_matrix.R
| create_matrix | R Documentation | 
create_matrix creates a document-term matrix: an object of class DocumentTermMatrix from the tm package.
create_matrix( textColumns, language = "english", minDocFreq = 1, minWordLength = 3, removeNumbers = TRUE, removePunctuation = TRUE, removeSparseTerms = 0, removeStopwords = TRUE, stemWords = FALSE, stripWhitespace = TRUE, toLower = TRUE, weighting = weightTf )
| textColumns | Either a character vector (e.g. data$Title) or a cbind() of columns to use for training the data (e.g. cbind(data$Title, data$Subject)). | 
| language | The language to be used for stemming the text data. | 
| minDocFreq | The minimum number of times a word should appear in a document for it to be included in the matrix. See package tm for more details. | 
| minWordLength | The minimum number of letters a word should contain to be included in the matrix. See package tm for more details. | 
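| removeNumbers | A logical parameter to specify whether to remove numbers. | 
| removePunctuation | A logical parameter to specify whether to remove punctuation. | 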
| removeSparseTerms | See package tm for more details. | 
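| removeStopwords | A logical parameter to specify whether to remove stopwords using the language specified in language. | 
| stemWords | A logical parameter to specify whether to stem words using the language specified in language. | 
| stripWhitespace | A logical parameter to specify whether to strip whitespace. | 
| toLower | A logical parameter to specify whether to convert all text to lowercase. | 
| weighting | Either weightTf or weightTfIdf. See package tm for more details. | 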
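To make the arguments above more concrete, here is a minimal base-R sketch of what a term-frequency document-term matrix looks like. The helper toy_dtm is hypothetical and is not how create_matrix is implemented (create_matrix delegates to tm). It only approximates toLower, removePunctuation, removeNumbers, stripWhitespace and minWordLength, and it skips stopword removal and stemming.

```r
# Hypothetical helper: a base-R approximation of a term-frequency matrix.
# This is NOT the tm implementation that create_matrix relies on.
toy_dtm <- function(docs, minWordLength = 3) {
  cleaned <- tolower(docs)                        # roughly toLower = TRUE
  cleaned <- gsub("[[:punct:]]+", " ", cleaned)   # roughly removePunctuation = TRUE
  cleaned <- gsub("[[:digit:]]+", " ", cleaned)   # roughly removeNumbers = TRUE
  # Trim and split on runs of whitespace (roughly stripWhitespace = TRUE).
  tokens <- strsplit(trimws(cleaned), "[[:space:]]+")
  # Drop words shorter than minWordLength letters.
  tokens <- lapply(tokens, function(tk) tk[nchar(tk) >= minWordLength])
  vocab <- sort(unique(unlist(tokens)))
  # One row per document, one column per term; cells hold raw counts.
  counts <- do.call(rbind, lapply(tokens, function(tk) {
    as.numeric(table(factor(tk, levels = vocab)))
  }))
  dimnames(counts) <- list(Docs = seq_along(docs), Terms = vocab)
  counts
}

docs <- c("I am very happy, excited, and optimistic.",
          "I am very scared, annoyed, and irritated.")
dtm <- toy_dtm(docs)
print(dtm)
```

Because stopwords are kept in this sketch, words such as "and" and "very" still appear as columns; with removeStopwords = TRUE the real function would be expected to drop them.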
Timothy P. Jurka <tpjurka@ucdavis.edu>
# LOAD THE PACKAGE
library(RTextTools)
# DEFINE THE DOCUMENTS
documents <- c("I am very happy, excited, and optimistic.",
"I am very scared, annoyed, and irritated.",
"Iraq's political crisis entered its second week one step closer to the potential
dissolution of the government, with a call for elections by a vital coalition partner
and a suicide attack that extended the spate of violence that has followed the withdrawal
of U.S. troops.",
"With nightfall approaching, Los Angeles authorities are urging residents to keep their
outdoor lights on as police and fire officials try to catch the person or people responsible
for nearly 40 arson fires in the last three days.")
matrix <- create_matrix(documents, language="english", removeNumbers=TRUE,
                        stemWords=FALSE, weighting=weightTfIdf)
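The call above requests weightTfIdf rather than the default weightTf. As a rough illustration of the difference, the following base-R sketch computes tf-idf by hand on a small count matrix. It assumes length-normalised term frequency and a base-2 logarithm for the inverse document frequency; tm's own defaults may differ in detail, so consult its documentation. The counts and term names are made up for illustration.

```r
# Toy counts: 2 documents (rows) x 3 terms (columns). Values are illustrative only.
counts <- matrix(c(2, 1, 0,
                   0, 1, 3),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Docs = c("1", "2"),
                                 Terms = c("crisis", "police", "fires")))

# Term frequency, normalised by the total number of terms in each document.
tf <- counts / rowSums(counts)

# Inverse document frequency: log2(number of documents / documents containing the term).
idf <- log2(nrow(counts) / colSums(counts > 0))

# Scale every term column by its idf weight.
tfidf <- sweep(tf, 2, idf, `*`)
print(round(tfidf, 3))
```

A term that occurs in every document ("police" here) receives an idf of zero and so drops out, while terms concentrated in one document keep a positive weight.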