View source: R/formatProcureText.R
Formats procurement text into a document-term matrix (a tm DocumentTermMatrix of term frequencies).
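Usage
formatProcureText(procure, text.var)
Arguments
procure    Procurement data to format. The function indexes it as procure[, text.var], so a data frame or matrix is expected.
text.var   Column of procure (name or index) containing the text.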
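Examples
##---- Should be DIRECTLY executable !! ----
##-- ==> Define data, use random,
##-- or do help(data=index) for the standard data sets.
library(tm)     # Corpus, VectorSource, tm_map, DocumentTermMatrix, stopwords
library(RWeka)  # NGramTokenizer, Weka_control
## The function is currently defined as
formatProcureText <- function (procure, text.var)
{
    # Despite its name, this tokenizer emits unigrams and bigrams (min = 1, max = 2).
    TrigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 1,
        max = 2))
    # Paste the selected text into plain character strings.
    text <- mapply(paste, procure[, text.var], collapse = " ")
    # Normalise: collapse whitespace, drop punctuation, lower-case.
    text <- stripWhitespace(text)
    text <- removePunctuation(text)
    text <- tolower(text)
    # Note: in recent tm versions Corpus() may return a SimpleCorpus, for which
    # custom tokenizers are reportedly ignored; VCorpus() is the usual alternative.
    text <- Corpus(VectorSource(text))
    # Remove English stop words and numbers.
    text <- tm_map(text, removeWords, c("the", stopwords("english")))
    text <- tm_map(text, removeNumbers)
    # Term-frequency weighted document-term matrix over the n-gram tokens.
    dtm <- DocumentTermMatrix(text, control = list(weighting = weightTf,
        tokenize = TrigramTokenizer))
    return(dtm)
}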
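The cleaning and tokenizing steps in the function above can be illustrated without tm or RWeka. The following is only a rough base-R sketch, not the package's implementation: the helper names clean_text and ngrams_1_2 and the sample strings in docs are hypothetical, stop-word removal is left out, and the regular expressions only approximate removePunctuation, removeNumbers and stripWhitespace.

```r
# Hypothetical helpers (not part of the package): a dependency-free
# approximation of the cleaning and tokenizing steps in formatProcureText().

clean_text <- function(x) {
  x <- gsub("[[:punct:]]+", "", x)   # drop punctuation (cf. removePunctuation)
  x <- gsub("[[:digit:]]+", "", x)   # drop digits (cf. removeNumbers)
  x <- tolower(x)
  x <- gsub("[[:space:]]+", " ", x)  # collapse whitespace runs (cf. stripWhitespace)
  trimws(x)
}

# Return all unigrams followed by all adjacent-word bigrams of one string,
# mirroring the min = 1, max = 2 setting of the tokenizer above.
ngrams_1_2 <- function(x) {
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  words <- words[nzchar(words)]
  bigrams <- if (length(words) > 1) {
    paste(head(words, -1), tail(words, -1))
  } else {
    character(0)
  }
  c(words, bigrams)
}

# Made-up sample text, for illustration only
docs <- c("Office  chairs, 12 units", "Office desks")
tokens <- lapply(clean_text(docs), ngrams_1_2)

# Plain count matrix: one row per document, one column per term
terms <- sort(unique(unlist(tokens)))
dtm <- t(sapply(tokens, function(tk) table(factor(tk, levels = terms))))
```

Here dtm is an ordinary integer matrix rather than a tm DocumentTermMatrix, which keeps the sketch self-contained; the real function returns the sparse tm object.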