termFreq | R Documentation |
Generate a term frequency vector from a text document.
termFreq(doc, control = list())
doc |
An object inheriting from |
control |
A list of control options which override default settings. First, the following two options are processed.
Next, a set of options which are sensitive to the order of
occurrence in the
Finally, the following options are processed in the given order.
|
A table of class c("term_frequency", "integer") with term frequencies as values and tokens as names.
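To make the shape of the return value concrete, here is a minimal base R sketch of a frequency table with counts as values and tokens as names. It is an illustration only and not how termFreq is implemented: the sample sentence and the variable names are made up, the whitespace split stands in for a real tokenizer, and the result is a plain table without the "term_frequency" class.

```r
# Illustration only: base R, no tm required. Not termFreq's implementation.
text <- "oil prices rose as oil demand rose"   # hypothetical sample text

# Split on runs of whitespace (a stand-in for a real tokenizer).
tokens <- unlist(strsplit(text, "[[:space:]]+"))

# Count occurrences: the values are frequencies, the names are tokens.
tf <- table(tokens)

names(tf)      # the distinct tokens
tf[["oil"]]    # frequency of a single token, looked up by name
```

The object returned by termFreq can be indexed by token name in the same way.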
getTokenizers
data("crude")
termFreq(crude[[14]])
if(requireNamespace("SnowballC")) {
  strsplit_space_tokenizer <- function(x)
    unlist(strsplit(as.character(x), "[[:space:]]+"))
  ctrl <- list(tokenize = strsplit_space_tokenizer,
               removePunctuation =
                 list(preserve_intra_word_dashes = TRUE),
               stopwords = c("reuter", "that"),
               stemming = TRUE,
               wordLengths = c(4, Inf))
  termFreq(crude[[14]], control = ctrl)
}