genDefaultSettings | R Documentation
A list of default settings for a TextMiner object:
genDefaultSettings(remove_punctuation = TRUE, remove_numbers = TRUE, tolower = TRUE, metric = "spherical", stemming = TRUE, remove_special_characters = TRUE, plain_text = TRUE, unique = TRUE, tm_package = "text2vec", weighting = "freq", wc_max_words = 50, wc_rot_per = 0.4, stop_words = c(letters, LETTERS, tm::stopwords("english")), wc_color = "blue", num_clust = 3, wc_gradient = "weight", dictionary = data.frame(), plot_color = "blue", sparsity = 0.999)
remove_punctuation
a single logical: should punctuation be removed from all text documents? (default is TRUE)
remove_numbers
a single logical: Should numbers be removed from all text documents? (default is TRUE)
tolower
a single logical: should all letters be converted to lower case? (default is TRUE)
stemming
a single logical: should all words be reduced to their stem? (default is TRUE)
remove_special_characters
a single logical: should all special characters be removed? (default is TRUE)
plain_text
a single logical: should all the documents be treated as plain text? (default is TRUE)
unique
a single logical: should duplicated documents be removed? (default is TRUE)
weighting
a single character: specifies the default weighting. Must be within c('freq', 'tfidf'). (default is 'freq')
metric
a single character: specifies the default metric for computing distances between the documents. Must be within c("euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski", "spherical"). (default is 'spherical')
wc_max_words
a single integer: specifies the maximum number of words shown in the word cloud. (default is 50)
wc_rot_per
a single numeric: must be between 0 and 1. Specifies the proportion of words shown rotated in the word cloud. (default is 0.4)
wc_color
a single character: specifies the color of the words shown in the word cloud. (default is 'blue')
wc_gradient
a single character: which weighting should be reflected by the color gradient in the word cloud. Must be within c('freq', 'tfidf').
plot_color
a single character: specifies the color of the points in the 2D and 3D point plots. (default is 'blue')
num_clust
a single integer: specifies the default number of clusters. (default is 3)
sparsity
a single numeric: must be between 0 and 1 and specifies the sparsity threshold. For example, if sparsity is 0.98, all words appearing in fewer than 2% of the documents will be removed. (default is 0.999)
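The effect of the sparsity argument can be illustrated with a small base R sketch. This is not the package's own implementation: the toy document-term matrix, the variable names (dtm, doc_share, keep, filtered) and the threshold of 0.7 are assumptions made up for illustration. It only shows the rule described above, where a word is dropped when it appears in fewer than (1 - sparsity) of the documents.

```r
# Toy document-term matrix (hypothetical counts): rows are documents,
# columns are words.
dtm <- matrix(
  c(1, 0, 2,
    1, 0, 0,
    3, 0, 1,
    1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("data", "rare", "text"))
)

# Hypothetical threshold, chosen so the effect is visible on 4 documents.
sparsity <- 0.7

# Share of documents in which each word occurs at least once.
doc_share <- colMeans(dtm > 0)

# Keep words that occur in at least (1 - sparsity) of the documents.
keep <- doc_share >= (1 - sparsity)
filtered <- dtm[, keep, drop = FALSE]

print(doc_share)
print(colnames(filtered))
```

Here "rare" occurs in one of four documents (25%), which is below the 30% cut-off implied by a sparsity of 0.7, so it is removed, while "data" and "text" are kept. With the default of 0.999, only words occurring in fewer than 0.1% of the documents would be dropped.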