docformer_config: Configuration for Docformer models

View source: R/supervised_model.R

docformer_config    R Documentation

Configuration for Docformer models

Description

Configuration for Docformer models

Usage

docformer_config(
  pretrained_model_name = NA_character_,
  coordinate_size = 128L,
  shape_size = 128L,
  hidden_dropout_prob = 0.1,
  attention_dropout_prob = 0.1,
  hidden_size = 768L,
  image_feature_pool_shape = c(7, 7, 256),
  intermediate_ff_size_factor = 4L,
  max_2d_position_embeddings = 1024L,
  max_position_embeddings = 512L,
  max_relative_positions = 8L,
  num_attention_heads = 12L,
  num_hidden_layers = 12L,
  vocab_size = 30522L,
  type_vocab_size = 2L,
  layer_norm_eps = 1e-12,
  batch_size = 9L,
  loss = "auto",
  epochs = 5,
  pretraining_ratio = 0.5,
  verbose = FALSE,
  device = "auto"
)
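
Every argument carries a default, so the constructor can be called with no arguments to obtain a baseline configuration. The sketch below is a minimal illustration of that; it presumably only builds the named list of hyperparameters and does not download any weights.

# minimal sketch: build the default configuration
default_config <- docformer_config()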

Arguments

pretrained_model_name

(character): one of the supported model names in transformers_config from which to derive the configuration.

coordinate_size

(int): Output size of each coordinate embedding (default 128)

shape_size

(int): Output size of each position embedding (default 128)

hidden_dropout_prob

(float): Dropout probability in docformer_encoder block (default 0.1)

attention_dropout_prob

(float): Dropout probability in docformer_attention block (default 0.1)

hidden_size

(int): Size of the hidden layer in common with text embedding and positional embedding (default 768)

image_feature_pool_shape

(vector of 3 int): Shape of the image feature pooling (default c(7, 7, 256), currently unused)

intermediate_ff_size_factor

(int): Intermediate feed-forward layer expansion factor (default 4)

max_2d_position_embeddings

(int): Max size of vector hosting the 2D embedding (default 1024)

max_position_embeddings

(int): Max sequence length for 1D embedding (default 512)

max_relative_positions

(int): Max number of positions to look at in the multimodal attention layer (default 8)

num_attention_heads

(int): Number of attention heads in the encoder (default 12)

num_hidden_layers

(int): Number of attention layers in the encoder (default 12)

vocab_size

(int): Length of the vocabulary (default 30522)

type_vocab_size

(int): Length of the type vocabulary (default 2)

layer_norm_eps

(float): Epsilon value used in normalisation layer (default 1e-12)

batch_size

(int): Size of the batch (default 9).

loss

(character or function): Loss function for training (defaults to MSE for regression and cross-entropy for classification)

epochs

(int): Number of training epochs (default 5).

pretraining_ratio

(float): Ratio of features to mask for reconstruction during pretraining. Ranges from 0 to 1 (default 0.5)

verbose

(bool): Whether to print progress and loss values during training.

device

(character): The device to use for training, either "cpu" or "cuda". The default ("auto") uses "cuda" when it is available and falls back to "cpu" otherwise, as sketched below.
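
A minimal sketch of how the "auto" setting is typically resolved, assuming the availability check goes through the torch backend; the actual internal helper in docformer may differ.

resolve_device <- function(device = "auto") {
  if (device == "auto") {
    # assumption: GPU availability is probed via the torch backend
    if (torch::cuda_is_available()) "cuda" else "cpu"
  } else {
    device
  }
}
resolve_device()  # "cuda" on a GPU machine, otherwise "cpu"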

Value

A named list with all the hyperparameters needed by the Docformer implementation.
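
Because the return value is a plain named list, individual hyperparameters can be inspected or overridden after creation. In the sketch below, the element names are assumed to mirror the argument names.

config <- docformer_config(hidden_size = 512L, num_attention_heads = 8L)
config$hidden_size           # assumed to match the `hidden_size` argument
str(config, max.level = 1)   # overview of all stored hyperparameters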

Examples

config <- docformer_config(
  num_attention_heads = 6L,
  num_hidden_layers = 6L,
  batch_size = 27,
  epochs = 5,
  verbose = TRUE
)
config <- docformer_config(
  pretrained_model_name = "hf-internal-testing/tiny-layoutlm",
  batch_size = 27,
  epochs = 5
)

