nlp_chunk_entity_resolver_pretrained: Load a pretrained Spark NLP Chunk Entity Resolver model
In r-spark/sparknlp: R Interface to John Snow Labs Spark NLP

Description

Load a pretrained Spark NLP ChunkEntityResolverModel.

Usage

nlp_chunk_entity_resolver_pretrained(
  sc,
  input_cols,
  output_col,
  all_distances_metadata = NULL,
  alternatives = NULL,
  case_sensitive = NULL,
  confidence_function = NULL,
  distance_function = NULL,
  distance_weights = NULL,
  enable_jaccard = NULL,
  enable_jaro_winkler = NULL,
  enable_levenshtein = NULL,
  enable_sorensen_dice = NULL,
  enable_tfidf = NULL,
  enable_wmd = NULL,
  extra_mass_penalty = NULL,
  miss_as_empty = NULL,
  neighbors = NULL,
  pooling_strategy = NULL,
  threshold = NULL,
  name,
  lang = NULL,
  remote_loc = NULL
)

Arguments

`sc`	A Spark connection
`input_cols`	Input columns. String array.
`output_col`	Output column. String.
`all_distances_metadata`	whether or not to return all distance values in the metadata.
`alternatives`	number of results to return in the metadata after sorting by last distance calculated
`case_sensitive`	whether to treat the entities as case sensitive
`confidence_function`	what function to use to calculate confidence: 'INVERSE' or 'SOFTMAX'
`distance_function`	what distance function to use for KNN: 'EUCLIDEAN' or 'COSINE'
`distance_weights`	distance weights to apply before pooling: (WMD, TFIDF, Jaccard, SorensenDice, JaroWinkler, Levenshtein)
`enable_jaccard`	whether or not to use Jaccard token distance.
`enable_jaro_winkler`	whether or not to use Jaro-Winkler character distance.
`enable_levenshtein`	whether or not to use Levenshtein character distance.
`enable_sorensen_dice`	whether or not to use Sorensen-Dice token distance.
`enable_tfidf`	whether or not to use TFIDF token distance.
`enable_wmd`	whether or not to use WMD token distance.
`extra_mass_penalty`	penalty for extra words in the knowledge base match during WMD calculation
`miss_as_empty`	whether or not to return an empty annotation on unmatched chunks
`neighbors`	number of neighbours to consider in the KNN query to calculate WMD
`pooling_strategy`	pooling strategy to aggregate distances: 'AVERAGE' or 'SUM'
`threshold`	threshold value for the aggregated distance
`name`	the name of the model to load. If NULL, the default value is used.
`lang`	the language of the model to be loaded. If NULL, the default value is used.
`remote_loc`	the remote location of the model. If NULL, the default value is used.
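
Examples

The signature above can be exercised with a call like the following. This is a minimal sketch, not taken from the package documentation. The helper name `load_icd10_resolver`, the column names, and the model coordinates (`"chunkresolve_icd10cm_clinical"`, `"en"`, `"clinical/models"`) are assumptions chosen for illustration. The sketch also assumes a Spark session with the licensed Spark NLP for Healthcare library available, and upstream stages that produce token and chunk-embedding columns.

```r
# Hypothetical helper: wraps the loader so the call is only made
# when a live Spark connection `sc` is supplied.
load_icd10_resolver <- function(sc) {
  sparknlp::nlp_chunk_entity_resolver_pretrained(
    sc,
    # Assumed column names produced by earlier pipeline stages.
    input_cols = c("token", "chunk_embeddings"),
    output_col = "resolution",
    # Optional tuning argument; all others are left at NULL here.
    case_sensitive = FALSE,
    # Assumed model coordinates; substitute the model you have access to.
    name = "chunkresolve_icd10cm_clinical",
    lang = "en",
    remote_loc = "clinical/models"
  )
}

# Usage (not run; needs Spark and the Spark NLP jars on the classpath):
# library(sparklyr)
# library(sparknlp)
# sc <- spark_connect(master = "local")
# resolver <- load_icd10_resolver(sc)
```

Only `sc`, `input_cols`, `output_col` and `name` lack defaults in the signature, so the remaining arguments can be omitted unless a specific setting needs to be changed.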