convert_by_vocab: Convert a sequence of tokens/ids using the provided vocab.

View source: R/tokenization.R

convert_by_vocab R Documentation

Description

Convert a sequence of tokens/ids using the provided vocab.

Usage

convert_by_vocab(vocab, items)

convert_tokens_to_ids(vocab, tokens)

convert_ids_to_tokens(inv_vocab, ids)

Arguments

vocab

Vocabulary; a named vector that maps keys (its names) to values. Typically the names are tokens and the values are ids. It may instead be an "inverse vocabulary", where the names are the ids and the values are the tokens.

items

Vector of the keys (names in the vocab vector) to "convert".

tokens

Equivalent to items.

inv_vocab

Equivalent to vocab.

ids

Equivalent to items.

Value

Vector of the values in 'vocab' corresponding to 'items'. (The names on the returned vector are kept.)
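
The name-keeping behavior can be illustrated with plain named-vector indexing in base R. This is a minimal sketch, assuming the lookup behaves like vocab[items]; vocab_sketch, result, and bare_values are hypothetical names, and the package source is not reproduced here.

```r
# Hypothetical vocabulary: names are tokens, values are ids.
vocab_sketch <- c("token1" = 0, "token2" = 1)

# Assumed behavior: index the named vector by its names.
result <- vocab_sketch[c("token2", "token1")]

# The result carries the looked-up keys as its names.
names(result)

# Drop the names when only the bare values are needed.
bare_values <- unname(result)
```

If downstream code expects an unnamed vector, unname() is one way to strip the keys after the lookup.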

Functions

  • convert_tokens_to_ids: Wrapper function for specifically converting tokens to ids.

  • convert_ids_to_tokens: Wrapper function for specifically converting ids to tokens.

Examples

convert_by_vocab(c("token1" = 0, "token2" = 1), "token1")
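
A further sketch of the "inverse vocabulary" case, written in base R only. It assumes the lookup is name-based indexing; lookup_sketch, fwd_vocab, and inv_vocab_sketch are hypothetical stand-ins, not the package functions. In base R a numeric index selects by position, so this sketch converts ids to character before looking them up; the package may handle that step differently.

```r
# Hypothetical forward vocabulary: names are tokens, values are ids.
fwd_vocab <- c("token1" = 0, "token2" = 1)

# Stand-in for the lookup (an assumption, not the package source).
lookup_sketch <- function(vocab, items) vocab[items]

# Tokens to ids: look up by token name.
ids <- lookup_sketch(fwd_vocab, c("token2", "token1"))

# Build an inverse vocabulary: names are the ids (as strings), values are tokens.
inv_vocab_sketch <- stats::setNames(
  names(fwd_vocab),
  as.character(unname(fwd_vocab))
)

# Ids to tokens: convert ids to character so the lookup is by name,
# not by position.
tokens <- lookup_sketch(inv_vocab_sketch, as.character(unname(ids)))
```

Here ids is a named numeric vector keyed by token, and tokens is a character vector keyed by id.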

jonathanbratt/RBERT documentation built on Jan. 26, 2023, 4:15 p.m.