SentencePieceTokenizer: SentencePieceTokenizer

View source: R/text_core.R

SentencePieceTokenizer    R Documentation

Description

Creates a SentencePiece subword tokenizer for the language given by 'lang'.

Usage

SentencePieceTokenizer(
  lang = "en",
  special_toks = NULL,
  sp_model = NULL,
  vocab_sz = NULL,
  max_vocab_sz = 30000,
  model_type = "unigram",
  char_coverage = NULL,
  cache_dir = "tmp"
)

Arguments

lang

Language code of the text to tokenize. Defaults to "en".

special_toks

Special tokens to reserve in the vocabulary. If NULL, fastai's default special tokens are used.

sp_model

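Path to an existing SentencePiece model file to load. If NULL, no model is loaded and one has to be trained on a corpus before tokenizing.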

vocab_sz

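Vocabulary size to use when training a new model. If NULL, the size is derived from the training corpus, subject to 'max_vocab_sz'.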

max_vocab_sz

Upper bound on the vocabulary size when 'vocab_sz' is NULL. Defaults to 30000.

model_type

SentencePiece model type, for example "unigram" (the default) or "bpe".

char_coverage

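Fraction of characters in the training corpus that the model must cover. If NULL, a default is chosen based on 'lang'.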

cache_dir

Directory in which the trained model and related files are stored. Defaults to "tmp".

Value

None
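
Examples

A minimal sketch of constructing the tokenizer with the arguments listed under Usage. It assumes the fastai R package and its Python backend are installed. The vocabulary size, model type, and cache path are illustrative values only, and the name tok_args is a hypothetical helper rather than part of the package.

```r
# Illustrative argument values (assumptions, not recommendations).
tok_args <- list(
  lang       = "en",
  vocab_sz   = 8000L,                        # integer vocabulary size
  model_type = "bpe",                        # the default is "unigram"
  cache_dir  = file.path(tempdir(), "spm")   # temporary location for this sketch
)

# Only attempt the call when fastai is installed; an error from a missing
# Python backend is caught and reported instead of stopping the script.
tok <- NULL
if (requireNamespace("fastai", quietly = TRUE)) {
  tok <- tryCatch(
    do.call(fastai::SentencePieceTokenizer, tok_args),
    error = function(e) {
      message("Could not create tokenizer: ", conditionMessage(e))
      NULL
    }
  )
}
```

Collecting the arguments in a list makes it easy to inspect or reuse them; calling SentencePieceTokenizer() directly with named arguments is equivalent.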


fastai documentation built on March 31, 2023, 11:41 p.m.