toDocumentTermMatrix: Default preprocessing of corpus and conversion to...
In SentimentAnalysis: Dictionary-Based Sentiment Analysis

toDocumentTermMatrix

R Documentation

Default preprocessing of corpus and conversion to document-term matrix

Description

Preprocesses an existing corpus of type Corpus using a default set of operations. This helper function bundles all standard preprocessing steps to make the package more convenient to use. The result is a document-term matrix.

Usage

toDocumentTermMatrix(
  x,
  language = "english",
  minWordLength = 3,
  sparsity = NULL,
  removeStopwords = TRUE,
  stemming = TRUE,
  weighting = function(x) tm::weightTfIdf(x, normalize = FALSE)
)

Arguments

`x`	`Corpus` object which should be processed
`language`	Default language used for preprocessing (i.e. stop word removal and stemming)
`minWordLength`	Minimum length of words used for cut-off; i.e. shorter words are removed. Default is 3.
`sparsity`	A numeric value giving the maximal allowed sparsity; it must be greater than zero and smaller than one. Default is `NULL`, which disables this functionality.
`removeStopwords`	Flag indicating whether to remove stop words (default: `TRUE`)
`stemming`	Perform stemming (default: TRUE)
`weighting`	Function used for weighting of words; default is a link to the tf-idf scheme.
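
The arguments above can be combined as in the following minimal sketch. It assumes the `tm` and `SentimentAnalysis` packages are installed (plus a stemmer such as `SnowballC` for the default `stemming = TRUE`) and uses the `crude` example corpus that ships with `tm`; the variable names and the sparsity value of 0.9 are illustrative choices, not recommendations from the package.

```r
library(tm)
library(SentimentAnalysis)

# Example corpus shipped with tm (20 Reuters articles)
data("crude")

# All defaults: stop word removal, stemming, tf-idf weighting
dtm_default <- toDocumentTermMatrix(crude)

# Illustrative variation: keep stop words, skip stemming, and
# weight by raw term frequency instead of tf-idf
dtm_raw <- toDocumentTermMatrix(
  crude,
  removeStopwords = FALSE,
  stemming = FALSE,
  weighting = function(x) tm::weightTf(x)
)

# Illustrative variation: additionally restrict the maximal allowed sparsity
dtm_sparse <- toDocumentTermMatrix(crude, sparsity = 0.9)
```

Setting `sparsity` typically yields a matrix with fewer terms than the default call, since rarely occurring terms are dropped.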

Value

An object of class DocumentTermMatrix
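
As a rough illustration of working with the returned object, the sketch below inspects it with standard `tm` helpers. It again assumes `tm` and `SentimentAnalysis` are installed and relies on the `crude` corpus from `tm`; the variable names are hypothetical.

```r
library(tm)
library(SentimentAnalysis)

data("crude")
dtm <- toDocumentTermMatrix(crude)

# The result is a regular tm document-term matrix
class(dtm)

# Number of documents (rows) and terms (columns)
dim(dtm)

# Look at a small corner of the matrix
inspect(dtm[1:3, 1:5])

# Convert to a dense matrix for further processing
# (only advisable for small corpora)
m <- as.matrix(dtm)
```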
