nlp_drug_normalizer: Spark NLP DrugNormalizer
In r-spark/sparknlp: R Interface to John Snow Labs Spark NLP

nlp_drug_normalizer

R Documentation

Spark NLP DrugNormalizer

Description

Spark ML transformer that normalizes raw text from clinical documents, e.g. scraped web pages or XML documents, from document-type columns into Sentence. It removes unwanted characters from the text according to one or more input regex patterns, can restrict that removal to a specific policy, and can optionally convert the text to lowercase. See https://nlp.johnsnowlabs.com/licensed/api/index.html#com.johnsnowlabs.nlp.annotators.DrugNormalizer

Usage

nlp_drug_normalizer(
  x,
  input_cols,
  output_col,
  lower_case = NULL,
  policy = NULL,
  uid = random_string("drug_normalizer_")
)

Arguments

`x`	A `spark_connection`, `ml_pipeline`, or a `tbl_spark`.
`input_cols`	Input columns. String array.
`output_col`	Output column. String.
`lower_case`	Whether to convert strings to lowercase. Logical.
`policy`	Removal policy that determines which patterns are removed from the text. String.
`uid`	A character string used to uniquely identify the ML estimator.
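
As an illustration of how these arguments fit together, the following is a minimal sketch of a pipeline that feeds a document column into the normalizer. It makes several assumptions: that a local Spark installation is available, that the licensed Spark NLP for Healthcare library is reachable from the connection, that the input table will have a `text` column, and that `"all"` is an accepted `policy` value (taken from the underlying Scala annotator's documentation, not from this page). The column names are illustrative only.

```r
library(sparklyr)
library(sparknlp)

# Assumption: a local Spark installation with the licensed
# Spark NLP for Healthcare jars available to the session.
sc <- spark_connect(master = "local")

pipeline <- ml_pipeline(sc) %>%
  # Turn the raw "text" column into a document annotation column.
  nlp_document_assembler(input_col = "text", output_col = "document") %>%
  # Normalize the document column.
  nlp_drug_normalizer(
    input_cols = c("document"),
    output_col = "document_normalized",
    lower_case = TRUE,  # convert strings to lowercase
    policy = "all"      # assumed value; check the annotator docs
  )
```

Leaving `lower_case` and `policy` at `NULL` should leave the annotator's own defaults in place.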

Value

The object returned depends on the class of x.

spark_connection: When x is a spark_connection, the function returns an instance of a ml_estimator object. The object contains a pointer to a Spark Estimator object and can be used to compose Pipeline objects.
ml_pipeline: When x is a ml_pipeline, the function returns a ml_pipeline with the NLP estimator appended to the pipeline.
tbl_spark: When x is a tbl_spark, an estimator is constructed then immediately fit with the input tbl_spark, returning an NLP model.
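
The three cases above can be sketched as follows. This is a hedged example, not a reference: it assumes a working local connection with the licensed library available, the sample drug strings are made up for illustration, and the classes printed by `class()` are left for the reader to inspect, not stated here.

```r
library(sparklyr)
library(sparknlp)

# Assumption: local Spark with the licensed Spark NLP jars available.
sc <- spark_connect(master = "local")

# 1. spark_connection: returns a standalone stage object.
stage <- nlp_drug_normalizer(
  sc,
  input_cols = c("document"),
  output_col = "document_normalized"
)
class(stage)

# 2. ml_pipeline: returns the pipeline with the stage appended.
pipeline <- ml_pipeline(sc) %>%
  nlp_drug_normalizer(
    input_cols = c("document"),
    output_col = "document_normalized"
  )
class(pipeline)

# 3. tbl_spark: the stage is constructed and applied to the table.
# The strings below are invented sample data.
drugs <- sdf_copy_to(
  sc,
  data.frame(
    text = c("Sodium Chloride 0.9% 500 ML bag", "Aspirin 81 MG oral tab"),
    stringsAsFactors = FALSE
  ),
  name = "drugs",
  overwrite = TRUE
)
docs <- nlp_document_assembler(drugs, input_col = "text", output_col = "document")
result <- nlp_drug_normalizer(
  docs,
  input_cols = c("document"),
  output_col = "document_normalized"
)
class(result)

spark_disconnect(sc)
```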