tokenize: Call 'Sudachi' tokenizer

View source: R/tokenize.R

tokenize    R Documentation

Description

Tokenizes the texts in x by calling the 'Sudachi' tokenizer.

Usage

tokenize(x, text_field, docid_field, instance)

Arguments

x

A data.frame-like object or a character vector to be tokenized.

text_field

Name of the column that contains the texts to be tokenized.

docid_field

Name of the column that contains the identifiers of the texts.

instance

A binding to an instance of <sudachipy.tokenizer.Tokenizer>. If you already have a tokenizer instance, passing it here can improve performance.
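
Examples

The following is a minimal sketch of how the arguments above might fit together; it is not taken from the package's own examples. The column names "doc_id" and "text", the sample sentences, and the object tok (standing for a tokenizer instance created elsewhere) are all illustrative assumptions, as is passing the column names as strings. The call itself is left in a "Not run" block because it needs the sudachir package and a working Python 'Sudachi' installation.

```r
# Input in the data.frame form described under Arguments: one row per text.
# The column names "doc_id" and "text" are illustrative choices.
docs <- data.frame(
  doc_id = c("d1", "d2"),
  text = c("東京都に行った。", "国家公務員はかく語りき"),
  stringsAsFactors = FALSE
)

## Not run:
# `tok` is a hypothetical, already created tokenizer instance.
# Passing the column names as strings is an assumption.
# res <- sudachir::tokenize(
#   docs,
#   text_field = "text",
#   docid_field = "doc_id",
#   instance = tok
# )
## End(Not run)
```

Reusing one tok across several calls is the situation the instance argument describes: the same predefined tokenizer is handed to each call.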


uribo/sudachir documentation built on Feb. 7, 2023, 11:09 a.m.