This preprocessing function applies a set of basic preprocessing rules to a text corpus.
preProcess(text, english = F, whitespace = F, stopwords = F,
           number = F, punc = F, stem = F, lower = F)
text |
The text corpus to analyze. |
english |
If TRUE, characters from alphabets other than English are deleted from the corpus. |
whitespace |
If TRUE, extra whitespace is deleted from the corpus. |
stopwords |
If TRUE, stopwords are deleted. Only English stopwords are supported; the stopword list comes from the tm package. |
number |
If TRUE, numbers are deleted from the corpus. |
punc |
If TRUE, punctuation is deleted from the corpus. |
stem |
If TRUE, terms in the corpus are stemmed. The stemming logic comes from the tm package. |
lower |
If TRUE, English characters are converted to lowercase. If the corpus includes languages other than English, the function will not work. |
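
To make the effect of the flags concrete, here is a rough base-R sketch of what some of them might do. It is not the package's implementation: `sketch_preprocess` is a hypothetical stand-in, the order in which the rules are applied is an assumption, and "non-English alphabets" is approximated as "non-ASCII characters". The `stopwords` and `stem` flags are left out because the documentation attributes them to the tm package.

```r
# Hypothetical stand-in for illustration only; NOT the package's preProcess().
# Covers the flags that need no external package (stopwords and stem omitted).
sketch_preprocess <- function(text, english = FALSE, whitespace = FALSE,
                              number = FALSE, punc = FALSE, lower = FALSE) {
  # lower: convert characters to lowercase
  if (lower) text <- tolower(text)
  # number: delete digits
  if (number) text <- gsub("[0-9]+", "", text)
  # punc: delete punctuation characters
  if (punc) text <- gsub("[[:punct:]]+", "", text)
  # english: assumption, approximated here by dropping non-ASCII characters
  if (english) text <- gsub("[^\\x01-\\x7F]+", "", text, perl = TRUE)
  # whitespace: collapse runs of whitespace and trim both ends
  if (whitespace) text <- trimws(gsub("\\s+", " ", text))
  text
}

docs <- c("Hello,   World! 2024", "Text   mining in R.")
cleaned <- sketch_preprocess(docs, whitespace = TRUE, number = TRUE,
                             punc = TRUE, lower = TRUE)
print(cleaned)
```

With every flag left at its default, the sketch returns the input unchanged, which matches the all-`F` defaults in the usage above: each rule is opt-in.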