embedding_postprocessor: Perform various post-processing on a word embedding tensor

View source: R/modeling.R


Perform various post-processing on a word embedding tensor

Description

This function optionally adds token type embeddings and position embeddings to the word embeddings.

Usage

embedding_postprocessor(
  input_tensor,
  use_token_type = FALSE,
  token_type_ids = NULL,
  token_type_vocab_size = 16L,
  token_type_embedding_name = "token_type_embeddings",
  use_position_embeddings = TRUE,
  position_embedding_name = "position_embeddings",
  initializer_range = 0.02,
  max_position_embeddings = 512L,
  dropout_prob = 0.1
)

Arguments

input_tensor

Float Tensor of shape [batch_size, seq_length, embedding_size].

use_token_type

Logical; whether to add embeddings for token_type_ids.

token_type_ids

(optional) Integer Tensor of shape [batch_size, seq_length]. Must be specified if use_token_type is TRUE.

token_type_vocab_size

Integer; the vocabulary size of token_type_ids. This defaults to 16 (here and in the BERT code), but must be set to 2 for compatibility with saved BERT checkpoints.

token_type_embedding_name

Character; the name of the embedding table variable for token type ids.

use_position_embeddings

Logical; whether to add position embeddings for the position of each token in the sequence.

position_embedding_name

Character; the name of the embedding table variable for positional embeddings.

initializer_range

Numeric; range of the random weight initialization (the standard deviation of the truncated normal distribution used to initialize the embedding tables).

max_position_embeddings

Integer; maximum sequence length that might ever be used with this model. This can be longer than the sequence length of input_tensor, but cannot be shorter.

dropout_prob

Numeric; dropout probability applied to the final output tensor.

Details

See Figure 2 in the BERT paper:

https://arxiv.org/pdf/1810.04805.pdf

Both type and position embeddings are learned model variables. Note that token "type" is essentially a sentence identifier, indicating which sentence (or, more generally, piece of text) the token belongs to.
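Conceptually, the post-processing is just the elementwise addition of rows looked up in two learned tables (with dropout then applied per dropout_prob). A minimal sketch in plain R, using matrices in place of Tensors for a single sequence (no batch dimension) and omitting dropout; all names and values here are illustrative, not the actual model variables:

```r
seq_length <- 4
embedding_size <- 8
token_type_vocab_size <- 2
max_position_embeddings <- 512

# Word embeddings for one sequence: [seq_length, embedding_size].
word_embeddings <- matrix(rnorm(seq_length * embedding_size),
                          nrow = seq_length)

# Stand-ins for the learned embedding tables.
token_type_table <- matrix(rnorm(token_type_vocab_size * embedding_size),
                           nrow = token_type_vocab_size)
position_table <- matrix(rnorm(max_position_embeddings * embedding_size),
                         nrow = max_position_embeddings)

# Token "type" is a sentence identifier, e.g. sentence A vs. B
# (1-based indices for this R sketch).
token_type_ids <- c(1, 1, 2, 2)

output <- word_embeddings +
  token_type_table[token_type_ids, ] +   # lookup by token type
  position_table[seq_len(seq_length), ]  # lookup by position

dim(output)  # same shape as word_embeddings: 4 x 8
```

Note that only the first seq_length rows of the position table are used, which is why max_position_embeddings may exceed, but never fall short of, the sequence length.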

Value

Float Tensor with the same shape as input_tensor.

Examples

## Not run: 
batch_size <- 10
seq_length <- 512
embedding_size <- 200
# Create example input tensors, using the TensorFlow 1.x variable API.
with(tensorflow::tf$variable_scope("examples",
  reuse = tensorflow::tf$AUTO_REUSE
), {
  # Word embeddings: [batch_size, seq_length, embedding_size].
  input_tensor <- tensorflow::tf$get_variable(
    "input",
    dtype = "float",
    shape = tensorflow::shape(batch_size, seq_length, embedding_size)
  )
  # Token type ids: [batch_size, seq_length].
  token_type_ids <- tensorflow::tf$get_variable(
    "ids",
    dtype = "int32",
    shape = tensorflow::shape(batch_size, seq_length)
  )
})
# Add token type embeddings (and, by default, position embeddings).
embedding_postprocessor(input_tensor,
  use_token_type = TRUE,
  token_type_ids = token_type_ids
)

## End(Not run)

jonathanbratt/RBERT documentation built on Jan. 26, 2023, 4:15 p.m.