layer_local_attention_1d: Strided block local self-attention.

View source: R/attention-layers.R

Description

The sequence is divided into blocks of length block_length. Attention for a given query position can see all memory positions in the corresponding block, plus filter_width positions to the left and to the right of that block.

Arguments

q	Tensor of shape [batch, heads, length, depth_k].
k	Tensor of shape [batch, heads, length, depth_k].
v	Tensor of shape [batch, heads, length, depth_v].

Value

A Tensor of shape [batch, heads, length, depth_v].
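To make the attention pattern concrete, the sketch below computes in plain R the range of memory positions visible to a query at position i. The helper visible_positions is hypothetical (not part of transformR), and the clamping of the window to [1, length] at the sequence boundaries is an assumption about how edges are handled.

	# Hypothetical helper: which memory positions can a query at
	# position i attend to under strided block local attention?
	visible_positions <- function(i, length, block_length = 1024L,
	                              filter_width = 100L) {
	  block <- (i - 1L) %/% block_length           # 0-based block index
	  start <- block * block_length + 1L - filter_width
	  end   <- (block + 1L) * block_length + filter_width
	  seq.int(max(1L, start), min(length, end))    # clamp to the sequence
	}

	# A query at position 1500 in a 4096-token sequence falls in the
	# second block (positions 1025..2048), so it can also see 100
	# positions on either side of that block:
	range(visible_positions(1500L, 4096L))  # 925 2148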

Usage

layer_local_attention_1d(
  q,
  k,
  v,
  block_length = 1024L,
  filter_width = 100L,
  name = "local_attention_1d"
)
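
Examples

A minimal usage sketch, assuming transformR is built on the R tensorflow package (TF2-style API) and that q, k, and v are ordinary TensorFlow tensors with the shapes given in the Description. The concrete sizes and the block_length/filter_width values here are illustrative only.

	library(tensorflow)
	library(transformR)

	batch <- 2L; heads <- 4L; length <- 2048L
	depth_k <- 64L; depth_v <- 64L

	# Random query/key/value tensors of shape [batch, heads, length, depth]
	q <- tf$random$normal(shape(batch, heads, length, depth_k))
	k <- tf$random$normal(shape(batch, heads, length, depth_k))
	v <- tf$random$normal(shape(batch, heads, length, depth_v))

	# Each query attends within its 512-token block, plus 64 positions
	# on either side of the block
	out <- layer_local_attention_1d(q, k, v,
	                                block_length = 512L,
	                                filter_width = 64L)
	out$shape  # (2, 4, 2048, 64)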
