Man pages for tmcn
A Text Mining Toolkit for Chinese

catUTF8         Print the UTF-8 codes of a string.
createDTM       Create a Chinese term-document matrix or a document-term...
createWordFreq  Create a word frequency data.frame.
GBK             GBK character set
getCharset      Get the current encoding of the locale.
isBIG5          Indicate whether the encoding of input string is BIG5.
isGB18030       Indicate whether the encoding of input string is GB18030.
isGB2312        Indicate whether the encoding of input string is GB2312.
isGBK           Indicate whether the encoding of input string is GBK.
isUTF8          Indicate whether the encoding of input string is UTF-8.
left            Extract the left or right substrings in a character vector.
NTUSD           National Taiwan University Semantic Dictionary
revUTF8         Revert UTF-8 string to Chinese character.
setchs          Set locale to Simplified Chinese/Traditional Chinese/UK.
SIMTRA          Dictionary of simplified and traditional Chinese
SPORT           Sport news.
STOPWORDS       Dictionary of Chinese stop words
stopwordsCN     Return Chinese stop words.
strcap          Mixed case capitalizing.
strextract      Extract matched substrings by regular expression.
strpad          Pad a string to a specified length with a padding character.
strstrip        Trim space of a string.
toPinyin        Convert a Chinese text to pinyin format.
toTrad          Convert a Chinese text from simplified to traditional...
toUTF8          Convert encoding of Chinese string to UTF-8.
tmcn documentation built on Aug. 8, 2019, 9:02 a.m.