| BaseModelFunnel | R Documentation |
Represents models based on the Funnel-Transformer.
Returns a new object of this class.
aifeducation::AIFEMaster -> aifeducation::AIFEBaseModel -> aifeducation::BaseModelCore -> BaseModelFunnel
Inherited methods:

aifeducation::AIFEMaster$get_all_fields()
aifeducation::AIFEMaster$get_documentation_license()
aifeducation::AIFEMaster$get_ml_framework()
aifeducation::AIFEMaster$get_model_config()
aifeducation::AIFEMaster$get_model_description()
aifeducation::AIFEMaster$get_model_info()
aifeducation::AIFEMaster$get_model_license()
aifeducation::AIFEMaster$get_package_versions()
aifeducation::AIFEMaster$get_private()
aifeducation::AIFEMaster$get_publication_info()
aifeducation::AIFEMaster$get_sustainability_data()
aifeducation::AIFEMaster$is_configured()
aifeducation::AIFEMaster$is_trained()
aifeducation::AIFEMaster$set_documentation_license()
aifeducation::AIFEMaster$set_model_description()
aifeducation::AIFEMaster$set_model_license()
aifeducation::BaseModelCore$calc_flops_architecture_based()
aifeducation::BaseModelCore$count_parameter()
aifeducation::BaseModelCore$create_from_hf()
aifeducation::BaseModelCore$estimate_sustainability_inference_fill_mask()
aifeducation::BaseModelCore$fill_mask()
aifeducation::BaseModelCore$get_final_size()
aifeducation::BaseModelCore$get_flops_estimates()
aifeducation::BaseModelCore$get_model()
aifeducation::BaseModelCore$get_model_type()
aifeducation::BaseModelCore$get_special_tokens()
aifeducation::BaseModelCore$get_tokenizer_statistics()
aifeducation::BaseModelCore$load_from_disk()
aifeducation::BaseModelCore$plot_training_history()
aifeducation::BaseModelCore$save()
aifeducation::BaseModelCore$set_publication_info()
aifeducation::BaseModelCore$train()

Method configure(): Configures a new object of this class. Please ensure that your chosen configuration complies with the following guideline:
hidden_size is a multiple of num_attention_heads.
Usage:

BaseModelFunnel$configure(
  tokenizer,
  max_position_embeddings = 512L,
  hidden_size = 768L,
  block_sizes = c(4L, 4L, 4L),
  num_attention_heads = 12L,
  intermediate_size = 3072L,
  num_decoder_layers = 2L,
  d_head = 64L,
  funnel_pooling_type = "Mean",
  hidden_act = "GELU",
  hidden_dropout_prob = 0.1,
  attention_probs_dropout_prob = 0.1,
  activation_dropout = 0
)
Arguments:

tokenizer: (TokenizerBase) Tokenizer for the model.

max_position_embeddings: (int) Number of maximum position embeddings. This parameter also determines the maximum length of a sequence which can be processed with the model. Allowed values: 10 <= x <= 4048

hidden_size: (int) Number of neurons in each layer. This parameter determines the dimensionality of the resulting text embedding. Allowed values: 1 <= x <= 2048

block_sizes: (vector of int) Determines the number and sizes of each block. Allowed values: vector of int

num_attention_heads: (int) Determines the number of attention heads for a self-attention layer. Only relevant if attention_type = 'multihead'. Allowed values: 0 <= x

intermediate_size: (int) Determines the size of the projection layer within each transformer encoder. Allowed values: 1 <= x

num_decoder_layers: (int) Number of decoding layers. Allowed values: 1 <= x

d_head: (int) Number of neurons of the final layer. Allowed values: 1 <= x

funnel_pooling_type: (string) Method for pooling over the sequence length. Allowed values: 'Mean', 'Max'

hidden_act: (string) Name of the activation function. Allowed values: 'GELU', 'relu', 'silu', 'gelu_new'

hidden_dropout_prob: (double) Ratio of dropout. Allowed values: 0 <= x <= 0.6

attention_probs_dropout_prob: (double) Ratio of dropout for attention probabilities. Allowed values: 0 <= x <= 0.6

activation_dropout: (double) Dropout probability between the layers of the feed-forward blocks. Allowed values: 0 <= x <= 0.6

num_hidden_layers: (int) Number of hidden layers. Allowed values: 1 <= x
Returns: Nothing. The method is called for its side effect of configuring the object.
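A minimal sketch of a typical configuration call follows. It assumes the class follows the usual R6 pattern of instantiation via $new(), and that my_tokenizer is a placeholder for an existing TokenizerBase object created elsewhere with the package; neither is shown in this help page.

```r
library(aifeducation)

# Sketch only: `my_tokenizer` is a hypothetical, previously created
# TokenizerBase object required by configure().
base_model <- BaseModelFunnel$new()

# Guideline check: hidden_size must be a multiple of num_attention_heads.
# Here 768 / 12 = 64, so the default values satisfy the constraint.
base_model$configure(
  tokenizer = my_tokenizer,
  max_position_embeddings = 512L,   # maximum processable sequence length
  hidden_size = 768L,               # dimensionality of the text embedding
  block_sizes = c(4L, 4L, 4L),      # three encoder blocks of four layers each
  num_attention_heads = 12L,
  intermediate_size = 3072L,
  num_decoder_layers = 2L,
  d_head = 64L,
  funnel_pooling_type = "Mean",     # pooling over the sequence length
  hidden_act = "GELU",
  hidden_dropout_prob = 0.1,
  attention_probs_dropout_prob = 0.1,
  activation_dropout = 0
)
```

The values shown are simply the documented defaults; any alternative hidden_size should be chosen so that it remains divisible by num_attention_heads.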
Method get_n_layers(): Number of layers.

Usage:

BaseModelFunnel$get_n_layers()

Returns: An int describing the number of layers available for embedding.
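For example, the returned layer count can be queried after configuration, e.g. to decide which hidden states to pool for embeddings. This sketch assumes base_model is an already configured BaseModelFunnel object:

```r
# Sketch: `base_model` is assumed to be a configured BaseModelFunnel.
n_layers <- base_model$get_n_layers()

# The result is a plain integer and can be used directly, for instance
# to select the last few layers for an embedding strategy.
print(n_layers)
```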
Method clone(): The objects of this class are cloneable with this method.

Usage:

BaseModelFunnel$clone(deep = FALSE)

Arguments:

deep: Whether to make a deep clone.
Dai, Z., Lai, G., Yang, Y. & Le, Q. V. (2020). Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing. doi:10.48550/arXiv.2006.03236
Other Base Model:
BaseModelBert,
BaseModelDebertaV2,
BaseModelMPNet,
BaseModelModernBert,
BaseModelRoberta