Chinese Word Segmentation

Description

Chinese word segmentation based on mmseg4j

Usage

mmseg4j(text, method = c("complex", "maxword", "simple"), dicDir = NULL)

Arguments

text

A string vector

method

Method of segmentation: one of "complex", "maxword", or "simple".

dicDir

Directory of a user-provided dictionary. If NULL, it is set to userDic in the root, which is used in addition to the default dictionaries.

Details

This function is a wrapper around the Java Chinese analyser mmseg4j 1.8.4 (http://code.google.com/p/mmseg4j/), which works for simplified Chinese only.

Value

A string vector similar to text, with spaces inserted between Chinese words.
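
Because the words in the returned strings are separated by spaces, the result can be split into one token vector per input string with base R. The following is a minimal sketch; the segmented strings are written by hand for illustration and are an assumption, not actual output from mmseg4j().

```r
# Hypothetical segmented strings, written by hand for illustration;
# they are NOT actual output from mmseg4j().
segmented <- c("研究 生命 起源", "中文 分词")

# Split each element on runs of whitespace, giving a list with
# one character vector of words per input string.
tokens <- strsplit(segmented, "[[:space:]]+")

# Count the words found in each input string.
word_counts <- lengths(tokens)
```

Keeping the tokens as a list preserves the one-to-one correspondence with the elements of text.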

Note

This function requires Java Runtime Environment (build 1.6.0_21-b07) or later.

Author(s)

Ronggui HUANG

Examples

## Use the following command to open the example file
## then you can copy and paste the commands into R

## file.show(file.path(path.package("rmmseg4j"),"mmseg4jExample.R"))
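
## A minimal sketch of a direct call, based only on the signature shown
## under Usage. It assumes the rmmseg4j package and a working Java runtime
## are installed; the input sentences are illustrative, and no output is
## shown because it depends on the dictionaries in use.

```r
# Illustrative simplified-Chinese input (an assumption, not from the package).
txt <- c("研究生命起源", "中文分词")

# Only attempt the call when the package is available; loading it
# additionally requires a working Java runtime.
if (requireNamespace("rmmseg4j", quietly = TRUE)) {
  library(rmmseg4j)
  # "complex" is the first method listed in the Usage signature.
  seg <- mmseg4j(txt, method = "complex")
  print(seg)
}
```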
