View source: R/text_to_sparse_matrix.R
This function takes an existing sparse text matrix (a textVectors object) and a character vector of new data. It tokenizes the new data, spellchecks it (not yet implemented), applies n-grams, and uses the bag of words from the input matrix to construct a new matrix with exactly the same columns. It then optionally applies the input matrix's tf-idf weightings to the new matrix and returns a new textVectors object. TODO: combine this with the matrix function.
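The column-alignment step described above can be illustrated with a minimal sketch. This is not the package's implementation: the helpers `tokenize` and `to_dtm` are hypothetical, n-grams, stemming and spellchecking are left out, and the idf formula `log(N / df)` is an assumption about how the weighting might be defined. It relies only on `sparseMatrix` and `Diagonal` from the Matrix package.

```r
library(Matrix)

# Hypothetical tokenizer: lower-case, then split on non-alphanumeric characters.
tokenize <- function(txt) {
  tokens <- strsplit(tolower(txt), "[^a-z0-9]+")
  lapply(tokens, function(tok) tok[nzchar(tok)])
}

# Hypothetical helper: count matrix restricted to a fixed vocabulary,
# so the result always has exactly one column per vocabulary word.
to_dtm <- function(txt, vocab) {
  tokens <- tokenize(txt)
  doc_id <- rep(seq_along(tokens), lengths(tokens))
  col_id <- match(unlist(tokens), vocab)
  keep <- !is.na(col_id)  # words unseen in the training data are dropped
  sparseMatrix(
    i = doc_id[keep], j = col_id[keep], x = rep(1, sum(keep)),
    dims = c(length(tokens), length(vocab)),
    dimnames = list(NULL, vocab))  # repeated (i, j) pairs are summed into counts
}

train <- c(
  "this package is fun",
  "thanks zach for writing it",
  "this package is the best package")
vocab <- sort(unique(unlist(tokenize(train))))
train_dtm <- to_dtm(train, vocab)

# Assumed weighting: idf = log(number of documents / document frequency),
# computed once on the training matrix and reused for new data.
idf <- log(nrow(train_dtm) / colSums(train_dtm > 0))

# New text is mapped onto the training columns, then reweighted.
new_dtm <- to_dtm("zach wrote a fun package", vocab)
new_tfidf <- new_dtm %*% Diagonal(x = idf)
colnames(new_tfidf) <- vocab
```

Because the vocabulary and idf vector both come from the training data, the new matrix can be passed to anything that was fitted on the original columns; words that never appeared in training simply contribute nothing.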
object | object of class textVectors
newdata | new data to apply the same matrix format to
verbose | print diagnostic messages
... | ignored
a textVectors object
http://stackoverflow.com/questions/4942361/how-to-turn-a-list-of-lists-to-a-sparse-matrix-in-r-without-using-lapply
http://en.wikipedia.org/wiki/Tf-idf
x <- c(
'i like this package written by zach mayer',
'this package is so much fun',
'thanks zach for writing it',
'this package is the best package',
'i want to give zach mayer a million dollars')
y <- 'this is a new sentence about how much i like this package written
by zach mayer it is a cool package and he is a cool dude'
a <- textVectors(
x,
absCutoff=1, ngrams=2, stem=TRUE, verbose=TRUE)
b <- textVectors(
x,
absCutoff=1, ngrams=2, stem=TRUE, verbose=TRUE, tfidf=TRUE)
predict(a, y)
predict(b, y)