nlp_bert_sentence_chunk_embeddings_pretrained: Load a pretrained Spark NLP BertSentenceChunkEmbeddings model

View source: R/bert_sentence_chunk_embeddings.R

nlp_bert_sentence_chunk_embeddings_pretrained    R Documentation

Load a pretrained Spark NLP BertSentenceChunkEmbeddings model

Description

Create a pretrained Spark NLP BertSentenceChunkEmbeddings model. The model produces BERT sentence embeddings for chunk annotations that take into account the context of the sentence in which each chunk appears. It is an extension of BertSentenceEmbeddings that combines the embedding of a chunk with the embedding of the surrounding sentence: for each input chunk annotation, it finds the corresponding sentence, computes the BERT sentence embedding of both the chunk and the sentence, and averages them. The resulting embeddings are useful when one needs a numerical representation of a text chunk that is sensitive to the context in which it appears.
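The averaging step described above can be illustrated with a minimal sketch in plain R. It assumes that "averages" means an element-wise mean of the two vectors; the three-dimensional vectors are made-up values for illustration only (real BERT embeddings have far more dimensions), and the variable names are hypothetical rather than part of the package.

```r
# Illustrative only: toy vectors standing in for BERT sentence embeddings.
chunk_embedding    <- c(0.2, 0.4, -0.6)  # embedding of the chunk text alone
sentence_embedding <- c(0.6, 0.0,  0.2)  # embedding of the surrounding sentence

# Assumption: the combination is an element-wise mean of the two vectors.
combined_embedding <- (chunk_embedding + sentence_embedding) / 2

print(combined_embedding)
```

The combined vector has the same length as its inputs, so downstream stages can treat it like any other sentence embedding.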

Usage

nlp_bert_sentence_chunk_embeddings_pretrained(
  sc,
  input_cols,
  output_col,
  case_sensitive = NULL,
  batch_size = NULL,
  dimension = NULL,
  max_sentence_length = NULL,
  name = NULL,
  lang = NULL,
  remote_loc = NULL
)

Arguments

sc

A Spark connection

input_cols

Input columns. String array.

output_col

Output column. String.

case_sensitive

whether to treat tokens as case sensitive. If FALSE, tokens are lowercased

batch_size

size of each batch processed by the model

dimension

number of embedding dimensions

max_sentence_length

maximum sentence length to process

name

the name of the model to load. If NULL, the default value is used

lang

the language of the model to load. If NULL, the default value is used

remote_loc

the remote location of the model. If NULL, the default value is used

Details

This model is a subclass of BertSentenceEmbeddings and shares all parameters with it. It can load any pretrained BertSentenceEmbeddings model. See https://nlp.johnsnowlabs.com/licensed/api/com/johnsnowlabs/nlp/annotators/embeddings/BertSentenceChunkEmbeddings.html

Value

The Spark NLP model with the pretrained model loaded
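
Examples

The following is a minimal sketch of how the function might be called, not a tested recipe. The wrapper name load_chunk_embedder and the column names "sentence", "chunk" and "chunk_embeddings" are hypothetical. It assumes that sc is an existing Spark connection with the Spark NLP libraries available (the linked API page sits under the licensed documentation, so a licensed setup may be required), and that the sentence and chunk columns are produced by earlier pipeline stages.

```r
# Hypothetical helper: wraps the documented call with assumed column names.
# Defining the function does not contact Spark; the model is only loaded
# when the helper is called with a live connection.
load_chunk_embedder <- function(sc,
                                sentence_col = "sentence",
                                chunk_col = "chunk",
                                output_col = "chunk_embeddings") {
  sparknlp::nlp_bert_sentence_chunk_embeddings_pretrained(
    sc,
    input_cols = c(sentence_col, chunk_col),  # assumed order: sentence, then chunk
    output_col = output_col
    # name, lang and remote_loc are left as NULL to use the default model
  )
}
```

With a connection in hand, load_chunk_embedder(sc) would return the loaded model. Pass name, lang and remote_loc directly to nlp_bert_sentence_chunk_embeddings_pretrained to select a specific pretrained model instead of the default.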


r-spark/sparknlp documentation built on Oct. 15, 2022, 10:50 a.m.