tmcn: A text mining toolkit for international characters, especially Chinese.
Version 0.1-4

Author: Jian Li <rweibo@sina.com>
Date of publication: 2015-03-02 14:40:04
Maintainer: Jian Li <rweibo@sina.com>
License: LGPL
Version: 0.1-4
Package repository: R-Forge
Installation: Install the latest version of this package by entering the following in R:
install.packages("tmcn", repos="http://R-Forge.R-project.org")

Man pages

catUTF8: Print the UTF-8 codes of a string.
createHashmapEnv: Create an environment for hash mapping.
GBK: GBK character set
getCharset: Get the current encoding of the locale.
getWordFreq: Get the word frequency data.frame.
isBIG5: Indicate whether the encoding of input string is BIG5.
isGB18030: Indicate whether the encoding of input string is GB18030.
isGB2312: Indicate whether the encoding of input string is GB2312.
isGBK: Indicate whether the encoding of input string is GBK.
isUTF8: Indicate whether the encoding of input string is UTF-8.
NTUSD: National Taiwan University Semantic Dictionary
revUTF8: Revert UTF-8 string to Chinese character.
setchs: Set locale to Simplified Chinese.
setcht: Set locale to Traditional Chinese.
SIMTRA: Dictionary of simplified and traditional Chinese
stopwordsCN: Return Chinese stop words.
strcap: Mixed case capitalizing.
strextract: Extract matched substrings by regular expression.
strpad: Pad a string to a specified length with a padding character.
strstrip: Trim space of a string.
tmcnTest: Run unit tests.
toPinyin: Convert a Chinese text to pinyin format.
toTrad: Convert a Chinese text from simplified to traditional...
toUTF8: Convert encoding of Chinese string to UTF-8.

Files

DESCRIPTION
NAMESPACE
R
R/catUTF8.R
R/createHashmapEnv.R
R/deprecated.R
R/getCharset.R
R/getWordFreq.R
R/isBIG5.R
R/isGB18030.R
R/isGB2312.R
R/isGBK.R
R/isUTF8.R
R/plotWordcloud.R
R/revUTF8.R
R/setchs.R
R/setcht.R
R/stopwordsCN.R
R/strcap.R
R/strextract.R
R/strpad.R
R/strstrip.R
R/tmcnTest.R
R/toPinyin.R
R/toTrad.R
R/toUTF8.R
R/utils.R
R/zzz.R
data
data/GBK.rda
data/NTUSD.rda
data/SIMTRA.rda
demo
demo/00Index
demo/demo.R
inst
inst/dic
inst/dic/stopwords.txt
inst/unittests
inst/unittests/runit.strextract.R
inst/unittests/runit.strpad.R
inst/unittests/runit.strstrip.R
man
man/GBK.Rd
man/NTUSD.Rd
man/SIMTRA.Rd
man/catUTF8.Rd
man/createHashmapEnv.Rd
man/getCharset.Rd
man/getWordFreq.Rd
man/isBIG5.Rd
man/isGB18030.Rd
man/isGB2312.Rd
man/isGBK.Rd
man/isUTF8.Rd
man/revUTF8.Rd
man/setchs.Rd
man/setcht.Rd
man/stopwordsCN.Rd
man/strcap.Rd
man/strextract.Rd
man/strpad.Rd
man/strstrip.Rd
man/tmcnTest.Rd
man/toPinyin.Rd
man/toTrad.Rd
man/toUTF8.Rd
src
src/tmcn_encoding_isbig5.cpp
src/tmcn_encoding_isgb18030.cpp
src/tmcn_encoding_isgb2312.cpp
src/tmcn_encoding_isgbk.cpp
src/tmcn_encoding_isutf8.cpp
tmcn documentation built on May 21, 2017, 12:37 a.m.
