tokenize: Simple version of tokenizer function.

View source: R/tokenize.R

tokenize R Documentation

Description

A simple version of the tokenizer function. The tokenize_tbl(), tokenize_tidytext(), and tokenize_tidy() variants take the same arguments.

Usage

tokenize(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tbl(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tidytext(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tidy(text, match_option = Match$ALL, stopwords = TRUE)

Arguments

text

The target text to tokenize.

match_option

A match option: use a value from Match. Default is Match$ALL.

stopwords

The stopwords option. The default, TRUE, uses the embedded stopwords dictionary. If FALSE, the embedded stopwords dictionary is not used. If a character string, it is treated as the path to a stopwords dictionary text file, and that file is used. If a Stopwords object, that object is used. Any other value behaves like FALSE. See analyze() for details on how the stopwords parameter is used.
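The stopwords options above can be sketched as follows (untested; "my_stopwords.txt" is a hypothetical file path, and the Stopwords$new() constructor call is an assumption about the Stopwords class):

    library(elbird)

    # Default: filter tokens using the embedded stopwords dictionary
    tokenize("Test text.")

    # Disable stopword filtering entirely
    tokenize("Test text.", stopwords = FALSE)

    # Use a custom dictionary file ("my_stopwords.txt" is a hypothetical path)
    tokenize("Test text.", stopwords = "my_stopwords.txt")

    # Use a Stopwords object directly (assumes Stopwords exposes a new() constructor)
    tokenize("Test text.", stopwords = Stopwords$new())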

Value

A list of tokenization results.

Examples

## Not run: 
  tokenize("Test text.")
  tokenize("Please use Korean.", Match$ALL_WITH_NORMALIZING)

## End(Not run)

elbird documentation built on Aug. 12, 2022, 5:08 p.m.