nlp_sentence_entity_resolver: Spark NLP SentenceEntityResolverApproach

View source: R/sentence_entity_resolver.R

nlp_sentence_entity_resolver    R Documentation

Spark NLP SentenceEntityResolverApproach

Description

Spark ML estimator that assigns a standard code (ICD10 CM, PCS, ICDO, CPT) to sentence embeddings pooled over chunks from TextMatchers or NER models. It is particularly useful when the sentence embeddings are computed with BertSentenceEmbeddings from the upstream chunks.

Usage

nlp_sentence_entity_resolver(
  x,
  input_cols,
  output_col,
  label_column = NULL,
  normalized_col = NULL,
  neighbors = NULL,
  threshold = NULL,
  miss_as_empty = NULL,
  case_sensitive = NULL,
  confidence_function = NULL,
  distance_function = NULL,
  uid = random_string("sentence_entity_resolver_")
)

Arguments

x

A spark_connection, ml_pipeline, or a tbl_spark.

input_cols

Input columns. String array.

output_col

Output column. String.

label_column

Name of the column holding the value to resolve.

normalized_col

Name of the column holding the original, normalized description.

neighbors

Number of neighbors to consider in the KNN query used to calculate WMD.

threshold

Threshold value for the aggregated distance.

miss_as_empty

Whether to return an empty annotation for unmatched chunks.

case_sensitive

Whether entities are matched case-sensitively.

confidence_function

Function used to calculate confidence: 'INVERSE' or 'SOFTMAX'.

distance_function

Distance function used for the KNN query: 'EUCLIDEAN' or 'COSINE'.

uid

A character string used to uniquely identify the ML estimator.
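
To make the roles of neighbors, threshold, and distance_function more concrete, here is a minimal base-R sketch of a nearest-neighbor lookup over toy embeddings. It is an illustration under stated assumptions, not Spark NLP's implementation: the helper resolve_chunk(), the two distance helpers, the 2-dimensional vectors, and the codes CODE_A, CODE_B, and CODE_C are all hypothetical, and the confidence functions are left out.

```r
# Illustrative sketch only; this is NOT how Spark NLP implements the resolver.

# Standard definitions of the two distances named by `distance_function`.
euclidean_distance <- function(a, b) sqrt(sum((a - b)^2))
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Hypothetical helper: return the labels of the `neighbors` closest reference
# embeddings, keeping only those whose distance is at most `threshold`.
resolve_chunk <- function(query, reference, labels, neighbors = 2,
                          threshold = 1, distance_function = "EUCLIDEAN") {
  dist_fn <- switch(distance_function,
    EUCLIDEAN = euclidean_distance,
    COSINE = cosine_distance,
    stop("distance_function must be 'EUCLIDEAN' or 'COSINE'")
  )
  # Distance from the query to every reference embedding (one per row).
  distances <- apply(reference, 1, function(row) dist_fn(query, row))
  # Indices of the k nearest rows, closest first.
  nearest <- order(distances)[seq_len(min(neighbors, length(distances)))]
  # Drop neighbors that are farther away than the threshold.
  kept <- nearest[distances[nearest] <= threshold]
  labels[kept]
}

# Toy 2-dimensional "embeddings" with made-up codes.
reference <- rbind(c(1, 0), c(0, 1), c(1, 1))
labels <- c("CODE_A", "CODE_B", "CODE_C")

# A query close to the first reference row.
resolve_chunk(c(0.9, 0.1), reference, labels, neighbors = 2, threshold = 1)

# A query far from every reference row: nothing passes the threshold.
resolve_chunk(c(10, 10), reference, labels, neighbors = 2, threshold = 1)
```

In this sketch a larger neighbors value widens the candidate set, while a smaller threshold discards candidates that are too distant, so a query with no sufficiently close reference yields an empty result.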

Details

See https://nlp.johnsnowlabs.com/docs/en/licensed_annotators#sentenceentityresolver

Value

The object returned depends on the class of x.

  • spark_connection: When x is a spark_connection, the function returns an ml_estimator object. The object contains a pointer to a Spark Estimator object and can be used to compose Pipeline objects.

  • ml_pipeline: When x is a ml_pipeline, the function returns a ml_pipeline with the NLP estimator appended to the pipeline.

  • tbl_spark: When x is a tbl_spark, an estimator is constructed and then immediately fit with the input tbl_spark, returning an NLP model.


r-spark/sparknlp documentation built on Oct. 15, 2022, 10:50 a.m.