View source: R/graph-extraction.R
nlp_graph_extraction | R Documentation |
Spark ML transformer that extracts a dependency graph between entities. The GraphExtraction class takes, for example, entities extracted by a NerDLModel and creates a dependency tree that describes how the entities relate to each other. The graph is stored in a triple store format: nodes represent the entities and edges represent the relations between them. The graph can then be used to find relevant relationships between words.
nlp_graph_extraction( x, input_cols, output_col, delimiter = NULL, dependency_parser_model = NULL, entity_types = NULL, explode_entities = NULL, include_edges = NULL, max_sentence_size = NULL, merge_entities = NULL, merge_entities_iob_format = NULL, min_sentence_size = NULL, pos_model = NULL, relationship_types = NULL, root_tokens = NULL, typed_dependency_parser_model = NULL, uid = random_string("graph_extraction_") )
x |
A spark_connection, ml_pipeline, or a tbl_spark. |
input_cols |
Input columns. String array. |
output_col |
Output column. String. |
delimiter |
Delimiter symbol used for path output (Default: ",") |
dependency_parser_model |
Coordinates (name, lang, remoteLoc) to a pretrained Dependency Parser model (Default: Array()) |
entity_types |
Find paths between a pair of entities (Default: Array()) |
explode_entities |
When set to true, find paths between entities (Default: false) |
include_edges |
Whether to include edges when building paths (Default: true) |
max_sentence_size |
Maximum sentence size that the annotator will process (Default: 1000). |
merge_entities |
Merge same neighboring entities as a single token (Default: false) |
merge_entities_iob_format |
IOB format to apply when merging entities |
min_sentence_size |
Minimum sentence size that the annotator will process (Default: 2). |
pos_model |
Coordinates (name, lang, remoteLoc) to a pretrained POS model (Default: Array()) |
relationship_types |
Find paths between a pair of token and entity (Default: Array()) |
root_tokens |
Tokens to be considered as roots from which to start traversing the paths (Default: Array()). |
typed_dependency_parser_model |
Coordinates (name, lang, remoteLoc) to a pretrained Typed Dependency Parser model (Default: Array()) |
uid |
A character string used to uniquely identify the ML estimator. |
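The three *_model arguments above each take coordinates of the form (name, lang, remoteLoc). As a minimal sketch, the helper below builds such a triple as a length-3 character vector. The helper model_coordinates is hypothetical (it is not part of sparknlp), the "en" and "public/models" defaults and the model names are assumptions, and it is also an assumption that the wrapper accepts a plain character vector for these arguments.

```r
# Hypothetical helper, not part of sparknlp: builds the (name, lang, remoteLoc)
# triple described for dependency_parser_model, pos_model and
# typed_dependency_parser_model as a length-3 character vector.
model_coordinates <- function(name, lang = "en", remote_loc = "public/models") {
  coords <- list(name = name, lang = lang, remote_loc = remote_loc)
  for (field in names(coords)) {
    value <- coords[[field]]
    # Each coordinate must be exactly one non-missing, non-empty string.
    if (!is.character(value) || length(value) != 1L ||
        is.na(value) || !nzchar(value)) {
      stop("`", field, "` must be a single non-empty string", call. = FALSE)
    }
  }
  unname(unlist(coords))
}

# Model names are illustrative assumptions; substitute the models you need.
pos_coords <- model_coordinates("pos_anc")
dep_coords <- model_coordinates("dependency_conllu")
```

The resulting vectors could then be passed as pos_model = pos_coords or dependency_parser_model = dep_coords, provided the wrapper accepts that shape.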
Both the DependencyParserModel and TypedDependencyParserModel need to be present in the pipeline. There are two ways to set them:
1. Both Annotators are present in the pipeline already. The dependencies are taken implicitly from these two Annotators.
2. Setting setMergeEntities to true (the merge_entities argument) will download the default pretrained models for those two Annotators automatically. The specific models can also be set with setDependencyParserModel and setTypedDependencyParserModel (the dependency_parser_model and typed_dependency_parser_model arguments).
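The option that relies on merge_entities can be sketched in R as follows. This is an illustration under stated assumptions, not a definitive recipe: the function names are hypothetical, the column names "document", "token" and "ner" stand for whatever upstream stages produce, the model names are illustrative, and passing coordinates as a length-3 character vector is assumed. When both parser annotators are already in the pipeline, the merge_entities and *_model arguments would simply be left unset.

```r
# Hypothetical wrappers around nlp_graph_extraction(); `pipeline` is expected
# to be an ml_pipeline that already yields document, token and entity columns.
# The column names used here are assumptions.

# Variant A: let the annotator download its default parser models.
graph_with_default_parsers <- function(pipeline) {
  sparknlp::nlp_graph_extraction(
    pipeline,
    input_cols = c("document", "token", "ner"),
    output_col = "graph",
    merge_entities = TRUE
  )
}

# Variant B: additionally pin specific parser models by coordinates
# (name, lang, remoteLoc). The model names below are assumptions.
graph_with_pinned_parsers <- function(pipeline) {
  sparknlp::nlp_graph_extraction(
    pipeline,
    input_cols = c("document", "token", "ner"),
    output_col = "graph",
    merge_entities = TRUE,
    dependency_parser_model = c("dependency_conllu", "en", "public/models"),
    typed_dependency_parser_model = c("dependency_typed_conllu", "en", "public/models")
  )
}
```

Both functions only define the call; nothing is downloaded until they are invoked with a pipeline on a live Spark connection.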
See https://nlp.johnsnowlabs.com/docs/en/annotators#graphextraction
The object returned depends on the class of x.
spark_connection: When x is a spark_connection, the function returns an instance of a ml_estimator object. The object contains a pointer to a Spark Estimator object and can be used to compose Pipeline objects.
ml_pipeline: When x is a ml_pipeline, the function returns a ml_pipeline with the NLP estimator appended to the pipeline.
tbl_spark: When x is a tbl_spark, an estimator is constructed then immediately fit with the input tbl_spark, returning an NLP model.
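To illustrate how the return value follows the class of x, here is a small sketch. The function build_graph_stages is hypothetical, the column names are assumptions, and sc is expected to be an open sparklyr connection with Spark NLP available.

```r
# Hypothetical helper showing two of the three dispatch cases described above.
build_graph_stages <- function(sc) {
  # x is a spark_connection: a standalone stage object is returned.
  stage <- sparknlp::nlp_graph_extraction(
    sc,
    input_cols = c("document", "token", "ner"),  # assumed column names
    output_col = "graph"
  )

  # x is an ml_pipeline: the same pipeline comes back with the stage appended.
  pipeline <- sparknlp::nlp_graph_extraction(
    sparklyr::ml_pipeline(sc),
    input_cols = c("document", "token", "ner"),
    output_col = "graph"
  )

  # The tbl_spark case is omitted: it needs a table that already contains
  # the annotated input columns.
  list(stage = stage, pipeline = pipeline)
}
```

Calling build_graph_stages(sc) would return both objects in a list, so their classes can be inspected with class().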