preprocess: preProcess


Description

A preprocessing function that applies basic text preprocessing rules to a corpus.

Usage

preProcess(text, english = FALSE, whitespace = FALSE, stopwords = FALSE,
  number = FALSE, punc = FALSE, stem = FALSE, lower = FALSE)

Arguments

text

The text corpus to analyze.

english

If TRUE, letters from alphabets other than English are deleted from the corpus.

whitespace

If TRUE, extra whitespace is deleted from the corpus.

stopwords

If TRUE, stopwords are deleted from the corpus. Only English stopwords are handled. The stopword list comes from the tm package.

number

If TRUE, numbers are deleted from the corpus.

punc

If TRUE, punctuation is deleted from the corpus.

stem

If TRUE, terms in the corpus are stemmed. The stemming logic comes from the tm package.

lower

If TRUE, English characters are converted to lowercase. If the text includes languages other than English, the function will not work.
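
To make the effect of the flags concrete, here is a minimal base-R sketch of what several of them roughly correspond to. It is not the SOCRATex implementation: `sketch_preprocess` is a hypothetical helper, the regular expressions and the order of the steps are assumptions, and `stopwords` and `stem` are left out because they depend on the tm package.

```r
# Hypothetical helper, NOT the package's code: a base-R approximation of
# the english, whitespace, number, punc and lower flags described above.
sketch_preprocess <- function(text, english = FALSE, whitespace = FALSE,
                              number = FALSE, punc = FALSE, lower = FALSE) {
  # lower: convert characters to lowercase
  if (lower) text <- tolower(text)
  # number: delete runs of digits
  if (number) text <- gsub("[0-9]+", "", text)
  # punc: delete punctuation characters
  if (punc) text <- gsub("[[:punct:]]+", "", text)
  # english: assumed here to mean "drop every non-ASCII character"
  if (english) text <- gsub("[^\\x01-\\x7F]", "", text, perl = TRUE)
  # whitespace: collapse repeated whitespace and trim both ends
  if (whitespace) text <- trimws(gsub("\\s+", " ", text))
  text
}

sketch_preprocess("Hello,  World 2021!",
                  lower = TRUE, number = TRUE, punc = TRUE, whitespace = TRUE)
# "hello world"
```

With every flag left at its default, the sketch returns the input unchanged, matching the all-FALSE defaults shown under Usage.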


ABMI/SOCRATex documentation built on March 20, 2021, 11:01 a.m.