Kiwi | R Documentation
The Kiwi class provides methods for Korean morphological analysis.
print()
Print method for Kiwi objects.
Kiwi$print(x, ...)
x
self
...
ignored
new()
Create a Kiwi instance.
Kiwi$new( num_workers = 0, model_size = "base", integrate_allomorph = TRUE, load_default_dict = TRUE )
num_workers
int(optional)
: number of worker threads to use. Default is 0, which uses all available cores.
model_size
char(optional)
: Kiwi model to use. Default is "base"; "small" and "large" are also available.
integrate_allomorph
bool(optional)
: whether to integrate allomorphs. Default is TRUE.
load_default_dict
bool(optional)
: whether to load the default dictionary. Default is TRUE.
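As a rough illustration of the arguments above, the constructor call can be built from a named list. This is only a sketch: it assumes the class is exported by the elbird package (the package name is not stated on this page) and that the model files are already available locally. kiwi_args is a hypothetical variable name.

```r
# Arguments taken from the usage line above; the values are only examples.
kiwi_args <- list(
  num_workers = 1L,            # one worker instead of all cores (0)
  model_size = "small",        # "base" is the default; "large" also exists
  integrate_allomorph = TRUE,
  load_default_dict = TRUE
)

# Assumption: Kiwi is exported by the elbird package.
# The guard skips the call in batch mode or when the package is missing.
if (interactive() && requireNamespace("elbird", quietly = TRUE)) {
  kw <- do.call(elbird::Kiwi$new, kiwi_args)
  print(kw)
}
```

Keeping the arguments in a list makes it easy to switch model sizes without editing the call itself.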
add_user_word()
Add a user word with its POS tag and score.
Kiwi$add_user_word(word, tag, score, orig_word = "")
word
char(required)
: target word to add.
tag
Tags(required)
: POS tag for the word.
score
num(required)
: score for the word.
orig_word
char(optional)
: original word.
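A minimal sketch of registering a user word might look like the following. It assumes the elbird package exports both Kiwi and Tags, and that Tags has an nnp (proper noun) entry; the word and score are made-up example values, and user_word is a hypothetical variable name.

```r
# Hypothetical example entry: a word to register with a neutral score.
user_word <- list(word = "딥러닝", score = 0)

# The guard skips the call in batch mode or when the package is missing.
if (interactive() && requireNamespace("elbird", quietly = TRUE)) {
  kw <- elbird::Kiwi$new()
  # Assumption: Tags exposes an `nnp` (proper noun) entry.
  kw$add_user_word(user_word$word, elbird::Tags$nnp, user_word$score)
  # Tokenize the word again to see whether the new entry is picked up.
  print(kw$tokenize(user_word$word))
}
```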
add_pre_analyzed_words()
TODO
Kiwi$add_pre_analyzed_words(form, analyzed, score)
form
char(required)
: target word form to register the analyzed result for.
analyzed
data.frame(required)
: expected analysis result.
score
num(required)
: score for the pre-analyzed result.
add_rules()
TODO
Kiwi$add_rules(tag, pattern, replacement, score)
tag
Tags(required)
: target tag the rule applies to.
pattern
char(required)
: regular expression.
replacement
char(required)
: replacement text.
score
num(required)
: score for the rule.
load_user_dictionarys()
Add a user dictionary from a text file.
Kiwi$load_user_dictionarys(user_dict_path)
user_dict_path
char(required)
: path of user dictionary file.
extract_words()
Extract noun candidates from texts.
Kiwi$extract_words( input, min_cnt, max_word_len, min_score, pos_threshold, apply = FALSE )
input
char(required)
: target text data.
min_cnt
int(required)
: minimum number of occurrences of a word in the text.
max_word_len
int(required)
: maximum word length.
min_score
num(required)
: minimum score.
pos_threshold
num(required)
: POS threshold.
apply
bool(optional)
: whether to add the extracted words to the user dictionary. Default is FALSE.
analyze()
Analyze text into tokens and tags.
Kiwi$analyze(text, top_n = 3, match_option = Match$ALL, stopwords = FALSE)
text
char(required)
: target text.
top_n
int(optional)
: number of results to return. Default is 3.
match_option
Match(optional)
: Match option to use. Default is Match$ALL.
stopwords
stopwords option. Default is FALSE, which uses no stopwords.
If TRUE, use the embedded stopwords dictionary.
If char (path to a dictionary txt file), use that file.
If a Stopwords class object, use it.
If the value is not valid, it works the same as FALSE.
A list of results.
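The stopwords rules above can be summarized as a small decision function. This is only a sketch that mirrors the documented behavior, not the package's own code; describe_stopwords is a hypothetical helper, and the "Stopwords" class name is assumed from the wording on this page.

```r
# Hypothetical helper: restates the documented rules for `stopwords`.
describe_stopwords <- function(stopwords) {
  if (inherits(stopwords, "Stopwords")) {
    # Assumption: Stopwords objects carry the class name "Stopwords".
    "use the given Stopwords object"
  } else if (isTRUE(stopwords)) {
    "use the embedded stopwords dictionary"
  } else if (is.character(stopwords) && length(stopwords) == 1) {
    "use the dictionary file at this path"
  } else {
    # FALSE and any invalid value behave the same way.
    "use no stopwords"
  }
}

describe_stopwords(TRUE)
describe_stopwords("my_stopwords.txt")  # made-up file name
describe_stopwords(NA)
```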
tokenize()
Analyze text into tokens and POS tags, returning only the top result.
Kiwi$tokenize( text, match_option = Match$ALL, stopwords = FALSE, form = "tibble" )
text
char(required)
: target text.
match_option
Match(optional)
: Match option to use. Default is Match$ALL.
stopwords
stopwords option. Default is FALSE, which uses no stopwords.
If TRUE, use the embedded stopwords dictionary.
If char (path to a dictionary txt file), use that file.
If a Stopwords class object, use it.
If the value is not valid, it works the same as FALSE.
form
char(optional)
: return format. Default is "tibble"; "list" and "tidytext" are also available.
split_into_sents()
Some text may not already be split into sentences. split_into_sents splits the text into individual sentences.
Kiwi$split_into_sents(text, match_option = Match$ALL, return_tokens = FALSE)
text
char(required)
: target text.
match_option
Match(optional)
: Match option to use. Default is Match$ALL.
return_tokens
bool(optional)
: whether to include the tokenized result. Default is FALSE.
get_tidytext_func()
Get a function to use with tidytext unnest_tokens.
Kiwi$get_tidytext_func(match_option = Match$ALL, stopwords = FALSE)
match_option
Match(optional)
: Match option to use. Default is Match$ALL.
stopwords
stopwords option. Default is FALSE, which uses no stopwords.
If TRUE, use the embedded stopwords dictionary.
If char (path to a dictionary txt file), use that file.
If a Stopwords class object, use it.
If the value is not valid, it works the same as FALSE.
function
\dontrun{
kw <- Kiwi$new()
tidytoken <- kw$get_tidytext_func()
tidytoken("test")
}
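One way the returned function might be combined with tidytext is sketched below. It assumes the elbird and tidytext packages are installed and that the returned function has the shape unnest_tokens expects for its token argument; docs is a made-up example data frame.

```r
# Made-up input: one document per row.
docs <- data.frame(
  id = 1:2,
  text = c("test one", "test two"),
  stringsAsFactors = FALSE
)

# The guard skips the call in batch mode or when a package is missing.
can_run <- interactive() &&
  requireNamespace("elbird", quietly = TRUE) &&
  requireNamespace("tidytext", quietly = TRUE)

if (can_run) {
  kw <- elbird::Kiwi$new()
  tidytoken <- kw$get_tidytext_func()
  # Pass the function as the tokenizer; `word` is the new output column.
  tokens <- tidytext::unnest_tokens(docs, output = word, input = text,
                                    token = tidytoken)
  print(tokens)
}
```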
clone()
The objects of this class are cloneable with this method.
Kiwi$clone(deep = FALSE)
deep
Whether to make a deep clone.
## Not run:
kw <- Kiwi$new()
kw$analyze("test")
kw$tokenize("test")
## End(Not run)

## ------------------------------------------------
## Method `Kiwi$get_tidytext_func`
## ------------------------------------------------

## Not run:
kw <- Kiwi$new()
tidytoken <- kw$get_tidytext_func()
tidytoken("test")
## End(Not run)