transformer_encoder_single_bert    R Documentation

Description:

Build a single layer of a BERT-style attention-based transformer.

Usage:
transformer_encoder_single_bert(
embedding_size,
intermediate_size = 4 * embedding_size,
n_head,
hidden_dropout = 0.1,
attention_dropout = 0.1
)
Arguments:

embedding_size      Integer; the dimension of the embedding vectors.

intermediate_size   Integer; the size of the dense layers applied after the
                    attention mechanism.

n_head              Integer; the number of attention heads per layer.

hidden_dropout      Numeric; the dropout probability to apply to the dense
                    layers.

attention_dropout   Numeric; the dropout probability to apply in attention.
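As a quick orientation, the call below spells out every argument, including
the defaults. The specific values are illustrative assumptions, not
recommendations; in BERT-style attention the embedding size is typically
expected to be divisible by the number of heads.

# a sketch with every argument spelled out (values are illustrative)
layer <- transformer_encoder_single_bert(
  embedding_size    = 128L,
  intermediate_size = 512L, # the default, 4 * embedding_size
  n_head            = 8L,   # 128 / 8 = 16 dimensions per head
  hidden_dropout    = 0.1,  # the default
  attention_dropout = 0.1   # the default
)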
Shape:

Inputs:

  input: (*, sequence_length, embedding_size)
  optional mask: (*, sequence_length)

Output:

  embeddings: (*, sequence_length, embedding_size)
  weights: (*, n_head, sequence_length, sequence_length)
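The sketch below exercises these shapes, including the optional mask. It
assumes the mask is passed as the second argument with 1 marking real tokens
and 0 marking padding, and that the layer returns the embeddings and the
attention weights in the order listed above.

layer <- transformer_encoder_single_bert(embedding_size = 4L, n_head = 2L)

# (batch, sequence_length, embedding_size)
x <- torch::torch_randn(2, 3, 4)

# assumed convention: 1 marks a real token, 0 marks padding
mask <- torch::torch_tensor(matrix(c(1, 1, 1,
                                     1, 1, 0),
                                   nrow = 2, byrow = TRUE))

out <- layer(x, mask)
out[[1]]$shape # embeddings: (2, 3, 4)
out[[2]]$shape # weights:    (2, 2, 3, 3)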
Examples:

# dimensions for a small test case
emb_size <- 4L
seq_len <- 3L
n_head <- 2L
batch_size <- 2L
# build a single encoder layer
model <- transformer_encoder_single_bert(
embedding_size = emb_size,
n_head = n_head
)
# fill the input with random values in [-1, 1]
input <- array(
sample(
-10:10,
size = batch_size * seq_len * emb_size,
replace = TRUE
) / 10,
dim = c(batch_size, seq_len, emb_size)
)
input <- torch::torch_tensor(input)
# returns the updated embeddings and the attention weights
model(input)
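To work with the result rather than just print it, the two pieces can be
captured separately. This assumes, matching the Output shapes above, that the
embeddings come first and the attention weights second:

out <- model(input)
emb <- out[[1]] # (batch_size, seq_len, emb_size)
wts <- out[[2]] # (batch_size, n_head, seq_len, seq_len)

# convert back to a plain R array if needed
dim(torch::as_array(emb))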