Man pages for paithiov909/audubon
Japanese Text Processing Tools

audubon-package      audubon: Japanese Text Processing Tools
bind_lr              Bind importance of bigrams
bind_tf_idf2         Bind term frequency and inverse document frequency
collapse_tokens      Collapse sequences of tokens by condition
get_dict_features    Get dictionary's features
hiroba               Whole tokens of 'Porano no Hiroba' written by Miyazawa Kenji...
lex_density          Calculate lexical density
mute_tokens          Mute tokens by condition
ngram_tokenizer      Ngrams tokenizer
pack                 Pack a data.frame of tokens
polano               Whole text of 'Porano no Hiroba' written by Miyazawa Kenji...
prettify             Prettify tokenized output
read_rewrite_def     Read a rewrite.def file
strj_fill_iter_mark  Fill Japanese iteration marks
strj_hiraganize      Hiraganize Japanese characters
strj_katakanize      Katakanize Japanese characters
strj_normalize       Convert text following the rules of 'NEologd'
strj_rewrite_as_def  Rewrite text using rewrite.def
strj_romanize        Romanize Japanese Hiragana and Katakana
strj_segment         Segment text into tokens
strj_tinyseg         Segment text into phrases
strj_tokenize        Split text into tokens
strj_transcribe_num  Transcribe Arabic to Kansuji
paithiov909/audubon documentation built on April 27, 2024, 10:11 a.m.