createDTM: Create a Chinese term-document matrix or a document-term...


View source: R/createDTM.R

Description

Create a Chinese term-document matrix or a document-term matrix.

Usage

createDTM(string, language = c("zh", "en"), tokenize = NULL, removePunctuation = TRUE, 
  removeNumbers = TRUE, removeStopwords = TRUE)
createTDM(string, language = c("zh", "en"), tokenize = NULL, removePunctuation = TRUE, 
  removeNumbers = TRUE, removeStopwords = TRUE)

Arguments

string

A character vector.

language

The language of the text: 'zh' for Chinese or 'en' for English.

tokenize

A tokenizer function.

removePunctuation

Whether to remove punctuation.

removeNumbers

Whether to remove numbers.

removeStopwords

Whether to remove stop words.

Details

Package "tm" is required.

Value

An object of class TermDocumentMatrix (from createTDM) or class DocumentTermMatrix (from createDTM).
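
Examples

The following is a minimal sketch of how the two functions might be called, not an example taken from the package itself. It assumes the tmcn and tm packages are installed; the sample sentences in `docs` are hypothetical. English input is used so that no Chinese word segmenter is needed.

```r
# Minimal sketch; assumes the tmcn and tm packages are installed.
library(tmcn)

# Hypothetical sample documents (not from the package documentation).
docs <- c("This is the first document.",
          "Here is a second document with 2 numbers and 1 comma, too.")

# Document-term matrix: one row per document, one column per term.
dtm <- createDTM(docs, language = "en")

# Term-document matrix: one row per term, one column per document.
# Stop words are kept here to illustrate the removeStopwords switch.
tdm <- createTDM(docs, language = "en", removeStopwords = FALSE)

# Inspect the dimensions of each matrix.
dim(dtm)
dim(tdm)
```

Both results are tm matrix objects, so they can be passed on to functions from the tm package such as `inspect()`.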

Author(s)

Jian Li


tmcn documentation built on March 18, 2018, 1:44 p.m.