SentencePieceTokenizer: SentencePieceTokenizer


View source: R/text_core.R

Description

SentencePiece tokenizer for 'lang'

Usage

SentencePieceTokenizer(
  lang = "en",
  special_toks = NULL,
  sp_model = NULL,
  vocab_sz = NULL,
  max_vocab_sz = 30000,
  model_type = "unigram",
  char_coverage = NULL,
  cache_dir = "tmp"
)

Arguments

lang

Language of the corpus, used to pick sensible defaults (default "en").

special_toks

Special tokens to add to the vocabulary; if NULL, the fastai defaults are used.

sp_model

Path to a pretrained SentencePiece model file; if NULL, a model is trained on the corpus.

vocab_sz

Vocabulary size for the trained model; if NULL, it is inferred from the corpus, capped at max_vocab_sz.

max_vocab_sz

Upper bound on the inferred vocabulary size when vocab_sz is NULL (default 30000).

model_type

SentencePiece model type: "unigram" (default), "bpe", "char", or "word".

char_coverage

Fraction of characters the model must cover; if NULL, a default based on lang is used.

cache_dir

Directory where the trained model and cache files are stored (default "tmp").

Value

None
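
A minimal usage sketch, assuming the fastai R package is installed and attached; the vocabulary size of 10000 is an illustrative choice, not a recommended value:

```r
library(fastai)

# Train (or reuse a cached) SentencePiece unigram model with a
# vocabulary capped at 10,000 tokens, storing model files under "tmp".
tok <- SentencePieceTokenizer(
  lang = "en",
  vocab_sz = 10000,
  model_type = "unigram",
  cache_dir = "tmp"
)
```

The resulting tokenizer can then be passed wherever the fastai text API expects a tokenizer object.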


fastai documentation built on Oct. 25, 2021, 5:08 p.m.