nlp_relation_extraction: Spark NLP RelationExtractionApproach
In r-spark/sparknlp: R Interface to John Snow Labs Spark NLP

Description

Spark ML estimator that trains a TensorFlow model for relation extraction. The TensorFlow graph, in .pb format, must be supplied through the model_file argument (setModelFile in the underlying Spark NLP API). Fitting the estimator produces a RelationExtractionModel. The arguments that must be set before training are described in the Arguments section.

See https://nlp.johnsnowlabs.com/docs/en/licensed_annotators#relationextraction

Usage

nlp_relation_extraction(
  x,
  input_cols,
  output_col,
  batch_size = NULL,
  dropout = NULL,
  epochs_number = NULL,
  feature_scaling = NULL,
  fix_imbalance = NULL,
  from_entity_begin_col = NULL,
  from_entity_end_col = NULL,
  from_entity_label_col = NULL,
  label_col = NULL,
  learning_rate = NULL,
  model_file = NULL,
  output_logs_path = NULL,
  to_entity_begin_col = NULL,
  to_entity_end_col = NULL,
  to_entity_label_col = NULL,
  validation_split = NULL,
  uid = random_string("relation_extraction_")
)

Arguments

`x`	A `spark_connection`, `ml_pipeline`, or a `tbl_spark`.
`input_cols`	Input columns. String array.
`output_col`	Output column. String.
`batch_size`	Batch size used during training.
`dropout`	Dropout coefficient.
`epochs_number`	Maximum number of epochs to train.
`feature_scaling`	Feature scaling method.
`fix_imbalance`	Whether to fix imbalance in the training set by replicating examples of underrepresented categories.
`from_entity_begin_col`	Column containing the beginning of the 'from' entity.
`from_entity_end_col`	Column containing the end of the 'from' entity.
`from_entity_label_col`	Column containing the label of the 'from' entity.
`label_col`	Column containing the label for each document.
`learning_rate`	Learning rate.
`model_file`	Location of the model file (the TensorFlow graph) used for classification.
`output_logs_path`	Path to the folder where training logs are written.
`to_entity_begin_col`	Column containing the beginning of the 'to' entity.
`to_entity_end_col`	Column containing the end of the 'to' entity.
`to_entity_label_col`	Column containing the label of the 'to' entity.
`validation_split`	Proportion of the training dataset to be validated against the model on each epoch.
`uid`	A character string used to uniquely identify the ML estimator.
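
As a rough sketch of how these arguments fit together, the call below builds an untrained estimator from a Spark connection. The column names, the graph path, and the hyperparameter values are illustrative assumptions, not package defaults, and the session is assumed to have the licensed Spark NLP jars available.

```r
library(sparklyr)
library(sparknlp)

# Assumption: a local Spark installation with licensed Spark NLP on the classpath.
sc <- spark_connect(master = "local")

re_approach <- nlp_relation_extraction(
  sc,
  # Hypothetical names of annotation columns produced by upstream stages.
  input_cols = c("embeddings", "pos_tags", "train_ner_chunks", "dependencies"),
  output_col = "relations",
  # Hypothetical training columns: relation label plus both entity spans.
  label_col = "rel",
  from_entity_begin_col = "begin1i",
  from_entity_end_col = "end1i",
  from_entity_label_col = "label1",
  to_entity_begin_col = "begin2i",
  to_entity_end_col = "end2i",
  to_entity_label_col = "label2",
  # Placeholder path to a TensorFlow graph in .pb format.
  model_file = "path/to/graph_file.pb",
  # Illustrative hyperparameters only.
  epochs_number = 50,
  batch_size = 200,
  dropout = 0.5,
  learning_rate = 0.001,
  fix_imbalance = TRUE
)
```

Arguments left as NULL are not set from R, so the underlying annotator keeps its own defaults for them.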

Value

The object returned depends on the class of x.

spark_connection: When x is a spark_connection, the function returns an ml_estimator object. The object contains a pointer to a Spark Estimator object and can be used to compose Pipeline objects.
ml_pipeline: When x is an ml_pipeline, the function returns an ml_pipeline with the NLP estimator appended to it.
tbl_spark: When x is a tbl_spark, an estimator is constructed and then immediately fit to the input tbl_spark, returning an NLP model.
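
The three cases can be illustrated with a minimal sketch. The helper add_relation_extraction, the table name re_training_data, the column names, and the graph path are all hypothetical. The table is assumed to already hold the annotation and label columns the estimator expects.

```r
library(sparklyr)
library(sparknlp)

# Assumption: a local Spark installation with licensed Spark NLP on the classpath.
sc <- spark_connect(master = "local")

# Hypothetical helper that applies the same settings to any supported x.
add_relation_extraction <- function(x) {
  nlp_relation_extraction(
    x,
    input_cols = c("embeddings", "pos_tags", "train_ner_chunks", "dependencies"),
    output_col = "relations",
    label_col = "rel",
    from_entity_begin_col = "begin1i",
    from_entity_end_col = "end1i",
    from_entity_label_col = "label1",
    to_entity_begin_col = "begin2i",
    to_entity_end_col = "end2i",
    to_entity_label_col = "label2",
    model_file = "path/to/graph_file.pb"  # placeholder path
  )
}

# x is a spark_connection: an unfitted estimator is returned.
estimator <- add_relation_extraction(sc)

# x is an ml_pipeline: the estimator is appended as a new stage.
pipeline <- add_relation_extraction(ml_pipeline(sc))

# x is a tbl_spark: the estimator is fit immediately and a model is returned.
train_tbl <- dplyr::tbl(sc, "re_training_data")  # hypothetical table name
model <- add_relation_extraction(train_tbl)
```

The pipeline form is usually the more convenient one when the upstream annotators (embeddings, part-of-speech tags, chunks, dependencies) are built in the same pipeline, since ml_fit() can then train every stage in one pass.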