Constructs a document-term matrix from Chinese text documents.
doc |
The Chinese text documents, given as a vector of Chinese strings. |
weighting |
Weighting scheme for the matrix: one of binary, count, tf, or tfidf. See Details. |
EngTermDeleted |
Remove English terms from the text documents. |
NumTermDeleted |
Remove numbers from the text documents. |
shortTermDeleted |
Delete short terms, i.e. those with nchar < 2. |
This function runs Chinese word segmentation with jiebaR and builds a document-term matrix. Four weighting schemes are available for the matrix: "binary" sets an entry to 1 if the term occurs in the document and 0 otherwise, "count" is the number of times the term occurs in the document, "tf" is the term frequency, and "tfidf" is the term frequency-inverse document frequency.
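To make the four weighting schemes concrete, here is a minimal Python sketch that is independent of this package and is not its R implementation. It assumes the documents have already been segmented into terms (the step this function delegates to jiebaR). The name `build_dtm` is hypothetical, and the exact definitions of "tf" (count divided by document length) and "idf" (natural log of the number of documents over the document frequency) are assumptions, since the package's formulas are not stated here.

```python
import math
from collections import Counter


def build_dtm(tokenized_docs, weighting="count"):
    """Return (vocabulary, matrix) for already-segmented documents.

    Hypothetical helper for illustration only; rows are documents,
    columns follow the order of the returned vocabulary.
    """
    if weighting not in {"binary", "count", "tf", "tfidf"}:
        raise ValueError(f"unknown weighting: {weighting}")

    vocab = sorted({term for doc in tokenized_docs for term in doc})
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized_docs for term in set(doc))

    matrix = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        total = len(doc)
        row = []
        for term in vocab:
            c = counts[term]
            if weighting == "binary":
                value = 1 if c > 0 else 0
            elif weighting == "count":
                value = c
            else:
                # Assumed definition: tf = count / document length.
                tf = c / total if total else 0.0
                if weighting == "tf":
                    value = tf
                else:
                    # Assumed definition: idf = ln(N / df).
                    value = tf * math.log(n_docs / df[term])
            row.append(value)
        matrix.append(row)
    return vocab, matrix


# Two toy documents, already split into terms.
docs = [["我", "喜欢", "猫", "猫"], ["我", "喜欢", "狗"]]
vocab, counts = build_dtm(docs, weighting="count")
```

Under these assumed definitions, a term that occurs in every document (such as "我" above) gets a tfidf weight of zero, while a term confined to one document keeps a positive weight.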
Jim Liu, Quan Gu