View source: R/layer-attention.R
layer_attention
Dot-product attention layer, a.k.a. Luong-style attention
Usage

layer_attention(
  inputs,
  use_scale = FALSE,
  score_mode = "dot",
  ...,
  dropout = NULL
)
Arguments

inputs
    List of the following tensors: a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and an optional key tensor of shape [batch_size, Tv, dim]. If no key is given, value is used as both key and value, which is the most common case.

use_scale
    If TRUE, will create a scalar variable to scale the attention scores.

score_mode
    Function to use to compute attention scores, one of "dot" or "concat". "dot" refers to the dot product between the query and key vectors; "concat" refers to the hyperbolic tangent of the concatenation of the query and key vectors.

...
    Standard layer arguments (e.g., batch_size, dtype, name, trainable, weights).

dropout
    Float between 0 and 1. Fraction of the units to drop for the attention scores. Defaults to 0.0.
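A minimal usage sketch, assuming the functional API from the keras R package; the sequence lengths and feature dimension (Tq = 4, Tv = 6, dim = 8) are illustrative and not taken from this page:

library(keras)

# Illustrative shapes: Tq = 4, Tv = 6, dim = 8 (batch dimension implicit).
query_input <- layer_input(shape = c(4, 8))   # query: [batch_size, Tq, dim]
value_input <- layer_input(shape = c(6, 8))   # value: [batch_size, Tv, dim]

# Passing only query and value; value is then also used as the key.
attended <- layer_attention(list(query_input, value_input))

model <- keras_model(
  inputs = list(query_input, value_input),
  outputs = attended
)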
Details

inputs are a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and a key tensor of shape [batch_size, Tv, dim]. The calculation follows these steps (a sketch reproducing them appears after the list):

1. Calculate scores with shape [batch_size, Tq, Tv] as a query-key dot product: scores = tf$matmul(query, key, transpose_b = TRUE).

2. Use scores to calculate a distribution with shape [batch_size, Tq, Tv]: distribution = tf$nn$softmax(scores).

3. Use distribution to create a linear combination of value with shape [batch_size, Tq, dim]: return tf$matmul(distribution, value).
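A minimal sketch of the three steps above using raw TensorFlow ops from R (not part of the original documentation); the dimensions (batch_size = 2, Tq = 4, Tv = 6, dim = 8) are illustrative, and value is reused as the key, which is the common case:

library(tensorflow)

# Illustrative tensors: batch_size = 2, Tq = 4, Tv = 6, dim = 8.
query <- tf$random$normal(shape = c(2L, 4L, 8L))
value <- tf$random$normal(shape = c(2L, 6L, 8L))
key   <- value  # reuse value as the key

# Step 1: query-key dot product -> scores with shape [2, 4, 6]
scores <- tf$matmul(query, key, transpose_b = TRUE)

# Step 2: softmax over the last axis -> distribution with shape [2, 4, 6]
distribution <- tf$nn$softmax(scores)

# Step 3: linear combination of value -> output with shape [2, 4, 8]
output <- tf$matmul(distribution, value)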
See also

Other core layers: layer_activation(), layer_activity_regularization(), layer_dense(), layer_dense_features(), layer_dropout(), layer_flatten(), layer_input(), layer_lambda(), layer_masking(), layer_permute(), layer_repeat_vector(), layer_reshape()