Convolution: Convolution



Layer factory function to create a convolution layer.


Convolution(filter_shape, num_filters = NULL, sequential = FALSE,
  activation = activation_identity, init = init_glorot_uniform(),
  pad = FALSE, strides = 1, sharing = TRUE, bias = TRUE,
  init_bias = 0, reduction_rank = 1, transpose_weight = FALSE,
  max_temp_mem_size_in_samples = 0, op_name = "Convolution", name = "")



filter_shape
(int or list of int) – shape (spatial extent) of the receptive field, not including the input feature-map depth. E.g. (3,3) for a 2D convolution.


num_filters
(int, defaults to NULL) – number of filters (output feature-map depth), or () to denote scalar output items (output shape will have no depth axis).


activation
(Function) – optional activation Function


init
(scalar or matrix or initializer, defaults to init_glorot_uniform()) – initial value of weights W


pad
(bool or list of bools) – if FALSE, the operation is shifted over the “valid” area of the input only, that is, no value outside the area is used. If pad=TRUE, the operation is applied to all input positions, and positions outside the valid region are treated as containing zero. Use a list to specify a per-axis value.


strides
(int or list of ints, defaults to 1) – stride of the operation. Use a list of ints to specify a per-axis value.


bias
(bool) – whether to include bias


init_bias
(scalar or matrix or initializer, defaults to 0) – initial value of the bias b


name
(string, optional) – the name of the Function instance in the network
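How pad and strides interact with the output size can be sketched in base R. The helper conv_output_size below is hypothetical and not part of CNTK-R; it assumes the common convention that an unpadded convolution only visits windows that fit entirely inside the input, while a zero-padded one visits every input position before striding. Treat it as an approximation of the behaviour described above, not as the library's exact rule.

```r
# Hypothetical helper (not part of CNTK-R): output length along ONE spatial axis.
conv_output_size <- function(input_size, filter_size, stride = 1, pad = FALSE) {
  if (pad) {
    # Zero padding: every input position can be a window centre, then strided.
    ceiling(input_size / stride)
  } else {
    # "Valid" area only: the window must fit entirely inside the input.
    floor((input_size - filter_size) / stride) + 1
  }
}

conv_output_size(28, 5)               # no padding, stride 1
conv_output_size(28, 5, pad = TRUE)   # padded, stride 1
conv_output_size(28, 3, stride = 2)   # no padding, stride 2

# Per-axis values: apply the helper axis by axis.
mapply(conv_output_size, c(32, 48), c(3, 5), c(1, 2), c(TRUE, FALSE))
```

Under these assumptions a 28-wide axis with a 5-wide filter shrinks to 24 without padding and stays at 28 with padding.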


This implements a convolution operation over items arranged on an N-dimensional grid, such as pixels in an image. Typically, each item is a vector (e.g. pixel: R,G,B), and the result is, in turn, a vector. The item-grid dimensions are referred to as the spatial dimensions (e.g. dimensions of an image), while the vector dimension of the individual items is often called feature-map depth.

For each item, convolution gathers a window (“receptive field”) of items surrounding the item’s position on the grid, and applies a little fully-connected network to it (the same little network is applied to all item positions). The size (spatial extent) of the receptive field is given by filter_shape. E.g. to specify a 2D convolution, filter_shape should be a tuple of two integers, such as (5,5); an example for a 3D convolution (e.g. video or an MRI scan) would be filter_shape=(3,3,3); while for a 1D convolution (e.g. audio or text), filter_shape has one element, such as (3,) or just 3.
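The "same little network at every position" idea can be illustrated with a minimal 1D sketch in base R. The function conv1d_valid and the variables x, W and b are hypothetical names chosen for this illustration; the code covers only the unpadded, stride-1 case without an activation, and is not how CNTK computes the result internally.

```r
# Hypothetical illustration in base R (not CNTK-R code).
# x: input, a matrix of shape (input depth, sequence length)
# W: weights, an array of shape (num_filters, input depth, filter size)
# b: bias, a vector of length num_filters
conv1d_valid <- function(x, W, b) {
  num_filters <- dim(W)[1]
  filter_size <- dim(W)[3]
  out_len <- ncol(x) - filter_size + 1
  out <- matrix(0, nrow = num_filters, ncol = out_len)
  for (pos in seq_len(out_len)) {
    # Receptive field: all feature maps at filter_size consecutive positions.
    window <- x[, pos:(pos + filter_size - 1), drop = FALSE]
    for (f in seq_len(num_filters)) {
      # The same weights W[f, , ] are reused at every position.
      out[f, pos] <- sum(W[f, , ] * window) + b[f]
    }
  }
  out
}

x <- matrix(1:10, nrow = 2)       # depth 2, length 5
W <- array(1, dim = c(3, 2, 3))   # 3 filters, depth 2, window of 3
b <- c(0, 1, 2)
y <- conv1d_valid(x, W, b)
dim(y)                            # one row per filter, one column per window position
```

Each output column is a vector of length num_filters, computed from one window of the input; the input depth never has to be stated separately because it is read from x.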

The dimension of the input items (input feature-map depth) is not to be specified. It is known from the input. The dimension of the output items (output feature-map depth) generated for each item position is given by num_filters.

If the input is a sequence, the sequence elements are by default treated independently. To convolve along the sequence dimension as well, pass sequential=TRUE. This is useful for variable-length inputs, such as video or natural-language processing (word n-grams). Note, however, that convolution does not support sparse inputs.

Both input and output items can be scalars instead of vectors. For scalar-valued input items, such as pixels in a black-and-white image or samples of an audio clip, specify reduction_rank=0. If the output items are scalar, pass num_filters=() or NULL.

A Convolution instance owns its weight parameter tensors W and b, and exposes them as attributes .W and .b. The weights will have the shape (num_filters, input_feature_map_depth, *filter_shape).
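The weight shape stated above can be assembled with a small base-R sketch. The helper conv_weight_shape is hypothetical and not part of CNTK-R. It assumes that the output-depth axis is simply dropped for scalar outputs (num_filters = NULL) and the input-depth axis is dropped for scalar inputs (reduction_rank = 0); the text above does not confirm this for the R package.

```r
# Hypothetical helper (not part of CNTK-R): expected shape of the weight tensor W.
conv_weight_shape <- function(filter_shape, num_filters = NULL,
                              input_depth = NULL, reduction_rank = 1) {
  # Assumption: num_filters = NULL means scalar outputs, so no output-depth axis.
  # Assumption: reduction_rank = 0 means scalar inputs, so no input-depth axis.
  in_axis <- if (reduction_rank == 0) NULL else input_depth
  c(num_filters, in_axis, filter_shape)
}

# 64 filters of size 3x3 over an RGB image (input depth 3)
conv_weight_shape(c(3, 3), num_filters = 64, input_depth = 3)

# 16 filters of size 5 over scalar audio samples
conv_weight_shape(5, num_filters = 16, reduction_rank = 0)
```

For the RGB case this gives a four-axis shape of 64, 3, 3, 3, matching (num_filters, input_feature_map_depth, *filter_shape).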

joeddav/CNTK-R documentation built on May 6, 2019, 7:28 a.m.