nlp_ner_dl | R Documentation |
This Named Entity Recognition annotator trains a generic NER model based on neural networks. Its training data (train_ner) is either a labeled Spark dataset or an external CoNLL 2003 IOB-based Spark dataset with Annotation columns. The user must also provide a word embeddings annotation column.
nlp_ner_dl( x, input_cols, output_col, label_col = NULL, max_epochs = NULL, lr = NULL, po = NULL, batch_size = NULL, dropout = NULL, verbose = NULL, include_confidence = NULL, include_all_confidence_scores = NULL, random_seed = NULL, graph_folder = NULL, validation_split = NULL, eval_log_extended = NULL, enable_output_logs = NULL, output_logs_path = NULL, enable_memory_optimizer = NULL, uid = random_string("ner_dl_") )
x |
A spark_connection, ml_pipeline, or tbl_spark. |
input_cols |
Input columns. String array. |
output_col |
Output column. String. |
label_col |
If DatasetPath is not provided, this column (a sequence of Annotation type) should contain the label for each token (string) |
max_epochs |
Maximum number of epochs to train (integer) |
lr |
Initial learning rate (float) |
po |
Learning rate decay coefficient. Real Learning Rate: lr / (1 + po * epoch) (float) |
batch_size |
Batch size for training (integer) |
dropout |
Dropout coefficient (float) |
verbose |
Verbosity level (integer) |
include_confidence |
Whether to include confidence values (boolean) |
include_all_confidence_scores |
Whether to include all confidence scores in the annotation metadata or only the score of the predicted tag (boolean) |
random_seed |
Random seed (integer) |
graph_folder |
Folder path that contains external graph files |
validation_split |
Proportion of the data to use for validation (float) |
eval_log_extended |
? (boolean) |
enable_output_logs |
Whether to enable the TensorFlow output logs (boolean) |
output_logs_path |
Path for the output logs |
enable_memory_optimizer |
Whether to allow training NerDLApproach on a dataset larger than memory (boolean) |
uid |
A character string used to uniquely identify the ML estimator. |
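The decay formula given for po can be checked with a few lines of base R. This is a minimal sketch of the arithmetic only, not the library's internal code; the helper name decayed_lr, the sample values of lr and po, and the assumption that epochs are counted from 0 are all illustrative and not taken from the package.

```r
# Effective learning rate as described for the `po` argument:
#   lr / (1 + po * epoch)
# `decayed_lr` is a hypothetical helper written for this illustration.
decayed_lr <- function(lr, po, epoch) {
  lr / (1 + po * epoch)
}

lr <- 0.001      # illustrative initial learning rate, not a documented default
po <- 0.005      # illustrative decay coefficient, not a documented default
epochs <- 0:4    # assumption: epochs are counted from 0

rates <- decayed_lr(lr, po, epochs)
print(data.frame(epoch = epochs, learning_rate = rates))
```

With po = 0 the rate stays constant at lr; larger values of po make the rate fall faster as the epoch number grows.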
The neural network architecture is Char CNNs - BiLSTM - CRF, which achieves state-of-the-art results on most datasets. See https://nlp.johnsnowlabs.com/docs/en/annotators#ner-dl
The object returned depends on the class of x.
spark_connection: When x is a spark_connection, the function returns an instance of a ml_estimator object. The object contains a pointer to a Spark Estimator object and can be used to compose Pipeline objects.
ml_pipeline: When x is a ml_pipeline, the function returns a ml_pipeline with the NLP estimator appended to the pipeline.
tbl_spark: When x is a tbl_spark, an estimator is constructed then immediately fit with the input tbl_spark, returning an NLP model.
When x is a spark_connection, the function returns a NerDLApproach estimator. When x is a ml_pipeline, it returns the pipeline with the NerDLApproach added. When x is a tbl_spark, it returns a transformed tbl_spark (note that the DataFrame passed in must have the input_cols specified).
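The dispatch on the class of x might look as follows in practice. This is a hedged sketch rather than an example from the package: it assumes the sparklyr and sparknlp packages and a local Spark installation are available, and the column names ("sentence", "token", "embeddings", "label", "ner") and hyperparameter values are illustrative assumptions. The input columns would normally be produced by earlier annotators in the pipeline, which are not shown here.

```r
library(sparklyr)
library(sparknlp)

# Assumption: a local Spark installation that sparklyr can locate.
sc <- spark_connect(master = "local")

# x is a spark_connection: an estimator is returned.
# Column names and hyperparameter values are illustrative only.
ner_estimator <- nlp_ner_dl(
  sc,
  input_cols = c("sentence", "token", "embeddings"),
  output_col = "ner",
  label_col = "label",
  max_epochs = 1,
  lr = 0.003,
  po = 0.005,
  batch_size = 8,
  validation_split = 0.2,
  random_seed = 0
)

# x is a ml_pipeline: the pipeline is returned with the estimator appended.
pipeline <- ml_pipeline(sc)
pipeline <- nlp_ner_dl(
  pipeline,
  input_cols = c("sentence", "token", "embeddings"),
  output_col = "ner",
  label_col = "label"
)

spark_disconnect(sc)
```

Passing a tbl_spark as x instead would fit the estimator immediately, so that table must already contain the input and label columns.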