embeddings_bert: Create BERT Embeddings

embeddings_bert    R Documentation

Create BERT Embeddings

Description

The input embeddings in a BERT model are the sum of three components: the embedding of the tokens themselves, the segment ("token type") embedding, and the position (token index) embedding. This function sets up the embedding layer for all three.
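
Conceptually, each of the three embeddings is a lookup table with one row per id, and the rows selected by the token ids, token type ids, and positions are summed element-wise. A minimal base-R sketch of that idea (illustrative only, not the package's implementation; all names and sizes below are made up):

emb_size <- 3L
word_table <- matrix(rnorm(7 * emb_size), nrow = 7) # vocab_size rows
type_table <- matrix(rnorm(2 * emb_size), nrow = 2) # token_type_vocab_size rows
pos_table  <- matrix(rnorm(5 * emb_size), nrow = 5) # max_position_embeddings rows

token_ids <- c(2L, 5L, 7L)        # one sequence of three tokens
token_type_ids <- c(1L, 1L, 1L)   # all tokens from the first segment
positions <- seq_along(token_ids) # 1, 2, 3

# The input embedding at each position is the sum of the three lookups.
input_embedding <- word_table[token_ids, ] +
  type_table[token_type_ids, ] +
  pos_table[positions, ]
dim(input_embedding) # sequence_length x embedding_size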

Usage

embeddings_bert(
  embedding_size,
  max_position_embeddings,
  vocab_size,
  token_type_vocab_size = 2L,
  hidden_dropout = 0.1
)

Arguments

embedding_size

Integer; the dimension of the embedding vectors.

max_position_embeddings

Integer; maximum number of tokens in each input sequence.

vocab_size

Integer; number of tokens in vocabulary.

token_type_vocab_size

Integer; number of input segments that the model will recognize. (Two for BERT models.)

hidden_dropout

Numeric; the dropout probability to apply to dense layers.
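
The last two arguments usually keep their defaults. As an illustration, a hypothetical call setting every argument explicitly (the sizes shown are typical of a BERT-base configuration and are chosen here only for illustration) might look like:

emb_layer <- embeddings_bert(
  embedding_size = 768L,
  max_position_embeddings = 512L,
  vocab_size = 30522L,
  token_type_vocab_size = 2L, # default: two segments, as in BERT
  hidden_dropout = 0.1        # default dropout probability
)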

Shape

With sequence_length <= max_position_embeddings:

Inputs:

  • token_ids: (*, sequence_length)

  • token_type_ids: (*, sequence_length)

Output:

  • (*, sequence_length, embedding_size)

Examples

emb_size <- 3L
mpe <- 5L
vs <- 7L
n_inputs <- 2L
# get random "ids" for input
t_ids <- matrix(sample(2:vs, size = mpe * n_inputs, replace = TRUE),
  nrow = n_inputs, ncol = mpe
)
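# mark every token as belonging to the first segment (token type 1)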
ttype_ids <- matrix(rep(1L, mpe * n_inputs), nrow = n_inputs, ncol = mpe)

model <- embeddings_bert(
  embedding_size = emb_size,
  max_position_embeddings = mpe,
  vocab_size = vs
)
model(
  torch::torch_tensor(t_ids),
  torch::torch_tensor(ttype_ids)
)
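
As noted in the Shape section, the output should have shape (n_inputs, mpe, emb_size), i.e. (2, 5, 3) here. A quick check, assuming the returned value is a torch tensor as the Shape section indicates:

emb <- model(
  torch::torch_tensor(t_ids),
  torch::torch_tensor(ttype_ids)
)
emb$shape # (2, 5, 3): (batch, sequence_length, embedding_size)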
