BasicTokenizer: Construct objects of BasicTokenizer class.

View source: R/tokenization.R

BasicTokenizer    R Documentation

Construct objects of BasicTokenizer class.

Description

Constructs an object of class BasicTokenizer. (I'm not sure that this object-based approach is the best fit for an R implementation; for now it just reproduces the Python functionality.)

Usage

BasicTokenizer(do_lower_case = TRUE)

Arguments

do_lower_case

Logical; the value to store as "do_lower_case" in the BasicTokenizer object, controlling whether input text is lower-cased during tokenization. Defaults to TRUE.

Details

Has methods:

'tokenize.BasicTokenizer()'
'run_strip_accents.BasicTokenizer()' (internal use)
'run_split_on_punc.BasicTokenizer()' (internal use)
'tokenize_chinese_chars.BasicTokenizer()' (internal use)
'is_chinese_char.BasicTokenizer()' (internal use)
'clean_text.BasicTokenizer()' (internal use)
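
The method names above follow the S3 'generic.class' naming pattern. As a rough illustration of how a constructor and a method of this kind fit together, here is a minimal base-R sketch. It is not the package's source code: 'sketch_tokenizer()', 'sketch_tokenize()' and the class name 'SketchTokenizer' are hypothetical names invented for this example, and the tokenization rule (optional lower-casing, spaces around punctuation, split on whitespace) is a simplifying assumption rather than the package's actual behavior.

```r
# Hypothetical stand-in for illustration only; not RBERT's implementation.
# Constructor: store the setting in a list and attach an S3 class.
sketch_tokenizer <- function(do_lower_case = TRUE) {
  stopifnot(is.logical(do_lower_case), length(do_lower_case) == 1L)
  structure(list(do_lower_case = do_lower_case), class = "SketchTokenizer")
}

# Generic: dispatches on the class of its first argument.
sketch_tokenize <- function(tokenizer, text) {
  UseMethod("sketch_tokenize")
}

# Method for SketchTokenizer objects (assumed, simplified rule):
# optionally lower-case, pad punctuation with spaces, split on whitespace.
sketch_tokenize.SketchTokenizer <- function(tokenizer, text) {
  if (tokenizer$do_lower_case) {
    text <- tolower(text)
  }
  text <- gsub("([[:punct:]])", " \\1 ", text)
  tokens <- strsplit(trimws(text), "[[:space:]]+")[[1]]
  tokens[nzchar(tokens)]  # drop any empty strings
}

tok <- sketch_tokenizer(do_lower_case = TRUE)
result <- sketch_tokenize(tok, "Hello, World!")
print(result)
```

Calling the generic on an object of class 'SketchTokenizer' dispatches to the matching method, in the same way that a 'tokenize()' generic would dispatch to 'tokenize.BasicTokenizer()'.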

Value

An object of class BasicTokenizer.

Examples

## Not run: 
# Construct a tokenizer with do_lower_case set to TRUE.
b_tokenizer <- BasicTokenizer(do_lower_case = TRUE)

## End(Not run)

jonathanbratt/RBERT documentation built on Jan. 26, 2023, 4:15 p.m.