View source: R/layer-attention.R
layer_attention: Dot-product attention layer, a.k.a. Luong-style attention
layer_attention(
inputs,
use_scale = FALSE,
score_mode = "dot",
...,
dropout = NULL
)
inputs
    List of the following tensors: a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and an optional key tensor of shape [batch_size, Tv, dim]. If no key is given, value is used as both key and value, which is the most common case.

use_scale
    If TRUE, will create a scalar variable to scale the attention scores.

score_mode
    Function to use to compute attention scores, one of "dot" or "concat". "dot" refers to the dot product between the query and key vectors; "concat" refers to the hyperbolic tangent of the concatenation of the query and key vectors.

...
    Standard layer arguments (e.g., batch_size, dtype, name, trainable, weights).

dropout
    Float between 0 and 1. Fraction of the units to drop for the attention scores. Defaults to 0.0.
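A minimal usage sketch of the signature above; the toy shapes (Tq = 8, Tv = 10, dim = 16) and the bare functional model are illustrative assumptions, not part of this documentation:

library(keras)

# Query and value sequences; key is omitted, so value is used as both
# key and value (the most common case).
query_input <- layer_input(shape = c(8, 16))   # query: [batch_size, Tq, dim]
value_input <- layer_input(shape = c(10, 16))  # value: [batch_size, Tv, dim]

# Luong-style dot-product attention over the query/value pair.
attended <- layer_attention(inputs = list(query_input, value_input))

model <- keras_model(inputs = list(query_input, value_input),
                     outputs = attended)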
The inputs are a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and a key tensor of shape [batch_size, Tv, dim]. The calculation follows these steps:

1. Calculate scores with shape [batch_size, Tq, Tv] as a query-key dot product: scores = tf$matmul(query, key, transpose_b = TRUE).
2. Use scores to calculate a distribution with shape [batch_size, Tq, Tv]: distribution = tf$nn$softmax(scores).
3. Use the distribution to create a linear combination of value with shape [batch_size, Tq, dim]: return tf$matmul(distribution, value).
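The same three steps can be sketched with plain TensorFlow ops from R; the concrete sizes below (batch_size = 2, Tq = 3, Tv = 4, dim = 5) are assumed for illustration only:

library(tensorflow)

query <- tf$random$normal(shape = c(2L, 3L, 5L))  # [batch_size, Tq, dim]
value <- tf$random$normal(shape = c(2L, 4L, 5L))  # [batch_size, Tv, dim]
key   <- value                                    # most common case: key = value

# 1. Query-key dot product -> scores: [batch_size, Tq, Tv]
scores <- tf$matmul(query, key, transpose_b = TRUE)

# 2. Softmax over the last axis -> distribution: [batch_size, Tq, Tv]
distribution <- tf$nn$softmax(scores)

# 3. Linear combination of value -> output: [batch_size, Tq, dim]
output <- tf$matmul(distribution, value)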
Other core layers:
layer_activation(),
layer_activity_regularization(),
layer_dense(),
layer_dense_features(),
layer_dropout(),
layer_flatten(),
layer_input(),
layer_lambda(),
layer_masking(),
layer_permute(),
layer_repeat_vector(),
layer_reshape()