create_model_genomenet: Create GenomeNet Model with Given Architecture Parameters

View source: R/create_model_genomenet.R

create_model_genomenet    R Documentation

Create GenomeNet Model with Given Architecture Parameters




create_model_genomenet(
  maxlen = 300,
  learning_rate = 0.001,
  number_of_cnn_layers = 1,
  conv_block_count = 1,
  kernel_size_0 = 16,
  kernel_size_end = 16,
  filters_0 = 256,
  filters_end = 512,
  dilation_end = 1,
  max_pool_end = 1,
  dense_layer_num = 1,
  dense_layer_units = 100,
  dropout_lstm = 0,
  dropout = 0,
  batch_norm_momentum = 0.8,
  leaky_relu_alpha = 0,
  dense_activation = "relu",
  skip_block_fraction = 0,
  residual_block = FALSE,
  reverse_encoding = FALSE,
  optimizer = "adam",
  model_type = "gap",
  recurrent_type = "lstm",
  recurrent_layers = 1,
  recurrent_bidirectional = FALSE,
  recurrent_units = 100,
  vocabulary_size = 4,
  last_layer_activation = "softmax",
  loss_fn = "categorical_crossentropy",
  auc_metric = FALSE,
  num_targets = 2,
  model_seed = NULL,
  bal_acc = FALSE,
  f1_metric = FALSE,
  mixed_precision = FALSE,
  mirrored_strategy = NULL
)



maxlen
(integer numeric(1))
Input sequence length.


learning_rate
Learning rate used by the keras optimizer that is specified by optimizer.


number_of_cnn_layers
(integer numeric(1))
Target number of CNN-layers to use in total. If number_of_cnn_layers is greater than conv_block_count, then the effective number of CNN layers is set to the closest integer that is divisible by conv_block_count.


conv_block_count
(integer numeric(1))
Number of convolutional blocks, into which the CNN layers are divided. If this is greater than number_of_cnn_layers, then it is set to number_of_cnn_layers (the convolutional block size will then be 1).
Convolutional blocks are used when model_type is "gap" (the output of the last conv_block_count * (1 - skip_block_fraction) blocks is fed to global average pooling and then concatenated), and also when residual_block is TRUE (the number of filters is held constant within blocks). If neither of these is the case, conv_block_count has little effect besides the fact that number_of_cnn_layers is set to the closest integer divisible by conv_block_count.
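
The interplay between number_of_cnn_layers and conv_block_count described above can be sketched in base R. The helper name effective_cnn_layout and the exact rounding rule (nearest whole number of layers per block) are assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical helper, not part of deepG: derive the effective CNN layout.
effective_cnn_layout <- function(number_of_cnn_layers, conv_block_count) {
  # conv_block_count is capped at number_of_cnn_layers (block size 1).
  conv_block_count <- min(conv_block_count, number_of_cnn_layers)
  # Assumed rounding: nearest whole number of layers per block, at least 1.
  block_size <- max(1, round(number_of_cnn_layers / conv_block_count))
  list(
    conv_block_count = conv_block_count,
    block_size = block_size,
    number_of_cnn_layers = conv_block_count * block_size
  )
}

effective_cnn_layout(7, 3)  # 3 blocks of 2 layers, 6 layers in total
effective_cnn_layout(2, 5)  # capped to 2 blocks of 1 layer, 2 layers in total
```

Under these assumptions, asking for 7 layers in 3 blocks yields 6 layers, the multiple of 3 closest to 7.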


kernel_size_0
Target CNN kernel size of the first CNN-layer. Although CNN kernel size is always an integer, this value can be non-integer, potentially affecting the kernel-sizes of intermediate layers (which are geometrically interpolated between kernel_size_0 and kernel_size_end).


kernel_size_end
Target CNN kernel size of the last CNN-layer; ignored if only one CNN-layer is used (i.e. if number_of_cnn_layers is 1). Although CNN kernel size is always an integer, this value can be non-integer, potentially affecting the kernel-sizes of intermediate layers (which are geometrically interpolated between kernel_size_0 and kernel_size_end).
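
To make the geometric interpolation between kernel_size_0 and kernel_size_end concrete, here is a minimal base R sketch. The helper interpolate_geometric is hypothetical, and rounding to the nearest integer at the end is an assumption; the package may round differently.

```r
# Hypothetical helper, not part of deepG: geometric interpolation of kernel sizes.
interpolate_geometric <- function(first, last, n_layers) {
  if (n_layers == 1) return(round(first))  # the end value is ignored
  # Position of each layer on a 0..1 scale.
  position <- seq(0, 1, length.out = n_layers)
  # Geometric interpolation: constant ratio between successive layers.
  round(first * (last / first)^position)
}

interpolate_geometric(4, 32, 4)    # 4 8 16 32
interpolate_geometric(4.6, 32, 4)  # a non-integer start shifts the middle layers
```

The second call shows why non-integer targets matter: the end points round to similar sizes, but the intermediate layers differ.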


filters_0
Target filter number of the first CNN-layer. Although CNN filter number is always an integer, this value can be non-integer, potentially affecting the filter-numbers of intermediate layers (which are geometrically interpolated between filters_0 and filters_end).
Note that filters are constant within convolutional blocks when residual_block is TRUE.


filters_end
Target filter number of the last CNN-layer; ignored if only one CNN-layer is used (i.e. if number_of_cnn_layers is 1). Although CNN filter number is always an integer, this value can be non-integer, potentially affecting the filter-numbers of intermediate layers (which are geometrically interpolated between filters_0 and filters_end).
Note that filters are constant within convolutional blocks when residual_block is TRUE.
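
One way to picture filters that stay constant within convolutional blocks when residual_block is TRUE is to interpolate once per block and repeat that value for every layer of the block. This is only a sketch: the helper block_constant_filters, its block_size argument, and the per-block interpolation scheme are all assumptions, not the package's implementation.

```r
# Hypothetical helper, not part of deepG: filter counts held constant within blocks.
block_constant_filters <- function(filters_0, filters_end, conv_block_count, block_size) {
  # seq() with length.out = 1 returns 0, so a single block uses filters_0.
  position <- seq(0, 1, length.out = conv_block_count)
  # Assumed: geometric interpolation at block level, rounded to integers.
  per_block <- round(filters_0 * (filters_end / filters_0)^position)
  # Repeat each block's filter count for every layer in that block.
  rep(per_block, each = block_size)
}

block_constant_filters(256, 512, conv_block_count = 2, block_size = 3)
# 256 256 256 512 512 512
```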


dilation_end
Dilation of the last CNN-layer within each block. Dilation rates within each convolutional block grow exponentially from 1 (no dilation) at the first CNN-layer of the block to this value at the last. Set to 1 (default) to disable dilation.
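
The exponential growth of dilation rates inside one block can be illustrated as follows. The helper block_dilation_rates and its block_size argument are hypothetical, and rounding to the nearest integer is an assumption.

```r
# Hypothetical helper, not part of deepG: dilation rates inside one block.
block_dilation_rates <- function(dilation_end, block_size) {
  # Position of each layer within the block on a 0..1 scale.
  position <- seq(0, 1, length.out = block_size)
  # Exponential growth from 1 (no dilation) to dilation_end.
  round(dilation_end^position)
}

block_dilation_rates(8, 4)  # 1 2 4 8
block_dilation_rates(1, 4)  # 1 1 1 1, i.e. dilation disabled
```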


max_pool_end
Target total effective pooling of the CNN part of the network. "Effective pooling" here is the product of the pooling rates of all previous CNN-layers. A network with three CNN-layers, all of which are followed by pooling layers of size 2, therefore has an effective pooling of 8, with the effective pooling at intermediate positions being 1 (beginning), 2, and 4. Effective pooling after each layer is set to the power of 2 that is, on a logarithmic scale, closest to max_pool_end ^ (<CNN layer number> / <total number of CNN layers>). Therefore, even though the total effective pooling of the whole CNN part of the network is always a power of 2, different, possibly non-integer, values of max_pool_end still lead to different networks.
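
The rule for max_pool_end can be worked through in base R. The helper pooling_schedule is hypothetical; it follows the formula quoted above and additionally assumes that each layer's pool size is the ratio between successive effective pooling values.

```r
# Hypothetical helper, not part of deepG: per-layer pooling from max_pool_end.
pooling_schedule <- function(max_pool_end, n_layers) {
  layer <- seq_len(n_layers)
  # Target effective pooling after each layer, on a log2 scale.
  log2_target <- log2(max_pool_end) * layer / n_layers
  # Snap to the nearest power of 2 on that scale.
  effective <- 2^round(log2_target)
  # Assumed: pool size of each layer = ratio to the previous effective pooling.
  pool_size <- effective / c(1, head(effective, -1))
  data.frame(layer = layer, effective = effective, pool_size = pool_size)
}

pooling_schedule(8, 3)  # effective pooling 2, 4, 8; pool sizes 2, 2, 2
pooling_schedule(4, 3)  # effective pooling 2, 2, 4; pool sizes 2, 1, 2
```

The first call reproduces the three-layer example from the text; the second shows a layer that ends up without pooling (pool size 1).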


dense_layer_num
(integer numeric(1))
Number of dense layers at the end of the network, not counting the output layer.


dense_layer_units
(integer numeric(1))
Number of units in each dense layer, except for the output layer.


dropout_lstm
Fraction of the units to drop for inputs.


dropout
Dropout rate of dense layers, except for the output layer.


batch_norm_momentum
Momentum parameter of the layer_batch_normalization layers used in the convolutional part of the network.


leaky_relu_alpha
Alpha parameter of the layer_activation_leaky_relu activation layers used in the convolutional part of the network.


dense_activation
Which activation function to use for dense layers. Should be one of "relu", "sigmoid", or "tanh".


skip_block_fraction
What fraction of the first convolutional blocks to skip. Only used when model_type is "gap".


residual_block
Whether to use residual layers in the convolutional part of the network.


reverse_encoding
Whether the network should have a second input for reverse-complement sequences.
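
For readers unfamiliar with the term, the reverse complement of a DNA sequence can be computed in base R as sketched below. The helper reverse_complement is hypothetical and only illustrates the concept; how the package prepares the second input is not described here.

```r
# Hypothetical helper, not part of deepG: reverse complement of a DNA string.
reverse_complement <- function(sequence) {
  # Swap each base with its complement (A<->T, C<->G) ...
  complemented <- chartr("ACGT", "TGCA", sequence)
  # ... then reverse the order of the characters.
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

reverse_complement("AACGT")  # "ACGTT"
```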


optimizer
Which optimizer to use. One of "adam", "adagrad", "rmsprop", or "sgd".


model_type
Whether to use the global average pooling ("gap") or recurrent ("recurrent") model type.


recurrent_type
Which recurrent network type to use. One of "lstm" or "gru". Only used when model_type is "recurrent".


recurrent_layers
(integer numeric(1))
Number of recurrent layers. Only used when model_type is "recurrent".


recurrent_bidirectional
Whether to use bidirectional recurrent layers. Only used when model_type is "recurrent".


recurrent_units
(integer numeric(1))
Number of units in each recurrent layer. Only used when model_type is "recurrent".


vocabulary_size
(integer numeric(1))
Vocabulary size of (one-hot encoded) input strings. This determines the input tensor shape, together with maxlen.
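
The relation between maxlen, vocabulary_size, and the input shape can be seen by one-hot encoding a short sequence in base R. The helper one_hot_encode and the symbol order A, C, G, T are assumptions for illustration; the package has its own data preparation.

```r
# Hypothetical helper, not part of deepG: one-hot encode a DNA string.
one_hot_encode <- function(sequence, vocabulary = c("A", "C", "G", "T")) {
  chars <- strsplit(sequence, "")[[1]]
  # One row per sequence position, one column per vocabulary symbol.
  encoded <- matrix(0, nrow = length(chars), ncol = length(vocabulary),
                    dimnames = list(NULL, vocabulary))
  # Set a 1 in the column that matches each position's symbol.
  encoded[cbind(seq_along(chars), match(chars, vocabulary))] <- 1
  encoded
}

x <- one_hot_encode("ACGTAC")
dim(x)  # 6 4, i.e. maxlen x vocabulary_size for a single sequence
```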


last_layer_activation
Activation function of the output layer. Either "sigmoid" or "softmax".


loss_fn
Either "categorical_crossentropy" or "binary_crossentropy". If label_noise_matrix is given, the custom "noisy_loss" is used instead.


auc_metric
Whether to add AUC metric.


num_targets
(integer numeric(1))
Number of output units to create.


model_seed
Set seed for model parameters in tensorflow if not NULL.


bal_acc
Whether to add balanced accuracy.


f1_metric
Whether to add F1 metric.


mixed_precision
Whether to use mixed precision.


mirrored_strategy
Whether to use distributed mirrored strategy. If NULL, the distributed mirrored strategy is used only if more than one GPU is available.


A keras model implementing the GenomeNet architecture.


model <- create_model_genomenet()

GenomeNet/deepG documentation built on Jan. 25, 2025, 12:05 a.m.