This notebook is adapted from the John Snow Labs Jupyter/Python getting-started notebook. See https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/quick_start.ipynb for that version.
Make sure you have already installed sparklyr and sparknlp.
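One way to check this prerequisite is to ask R which of the packages loaded by this notebook are still missing. The following is a minimal sketch in base R. The install hints in the comments are assumptions, in particular that sparknlp is installed from the r-spark/sparknlp GitHub repository with the remotes package, so adjust them to your setup.

```r
# Packages loaded by this notebook.
required <- c("dplyr", "sparklyr", "sparklyr.nested", "sparknlp")

# requireNamespace() returns FALSE (rather than raising an error)
# when a package is not installed.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Assumed install commands, not run here:
  #   install.packages(c("dplyr", "sparklyr", "sparklyr.nested"))
  #   remotes::install_github("r-spark/sparknlp")
} else {
  message("All required packages are installed.")
}
```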
library(dplyr)
library(sparklyr)
library(sparklyr.nested)
library(sparknlp)

version <- Sys.getenv("SPARK_VERSION", unset = "2.4.3")
config <- sparklyr::spark_config()
options(sparklyr.sanitize.column.names.verbose = TRUE)
options(sparklyr.verbose = TRUE)
options(sparklyr.na.omit.verbose = TRUE)
options(sparklyr.na.action.verbose = TRUE)
sc <- sparklyr::spark_connect(master = "local", version = version, config = config)
cat("Apache Spark version: ", sc$home_version)
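The setup above picks the Spark version from the SPARK_VERSION environment variable and falls back to 2.4.3 when it is not set. The sketch below shows that fallback behaviour in isolation. It uses a hypothetical variable name, DEMO_SPARK_VERSION, so that it does not disturb a real SPARK_VERSION setting, and the override value 2.4.5 is chosen only for illustration.

```r
# Hypothetical variable name, used only so the demo leaves SPARK_VERSION alone.
Sys.unsetenv("DEMO_SPARK_VERSION")

# Variable not set: Sys.getenv() returns the value passed as `unset`.
default_version <- Sys.getenv("DEMO_SPARK_VERSION", unset = "2.4.3")

# Variable set: the environment value takes precedence over `unset`.
Sys.setenv(DEMO_SPARK_VERSION = "2.4.5")
override_version <- Sys.getenv("DEMO_SPARK_VERSION", unset = "2.4.3")

cat("Default: ", default_version, "\n")
cat("Override:", override_version, "\n")

# Clean up the demo variable.
Sys.unsetenv("DEMO_SPARK_VERSION")
```

To run the notebook against a different Spark version, the same mechanism applies: set SPARK_VERSION before starting R, or call Sys.setenv() before the connection is created.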
Let's use a Spark NLP pre-trained pipeline for named entity recognition.
pipeline <- nlp_pretrained_pipeline(sc, "recognize_entities_dl", lang = "en")
result <- nlp_annotate(pipeline, "Google has announced the release of a beta version of the popular TensorFlow machine learning library.")
result
result %>% mutate(entity_type = ner.result) %>% pull(entity_type) %>% unlist()
result %>% mutate(named_entities = entities.result) %>% pull(named_entities) %>% unlist()
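The mutate, pull, unlist chain used above can be tried without a Spark connection. The sketch below assumes that the annotation result behaves like a data frame whose ner.result column is a list column. The toy_result object and its labels are made up for illustration and are not output from the pipeline.

```r
library(dplyr)

# Toy stand-in for an annotation result: one row, with the labels stored in a
# list column. The text and labels below are made up for illustration.
toy_result <- tibble::tibble(
  text = "Google released TensorFlow",
  ner.result = list(c("B-ORG", "O", "B-MISC"))
)

entity_types <- toy_result %>%
  mutate(entity_type = ner.result) %>%  # copy the list column under a friendlier name
  pull(entity_type) %>%                 # extract it as a plain R list, one element per row
  unlist()                              # flatten that list into a character vector

print(entity_types)
```

The mutate() step only renames for readability. Calling pull(ner.result) directly would extract the same values.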
Let's use a Spark NLP pre-trained pipeline for sentiment analysis.
pipeline <- nlp_pretrained_pipeline(sc, "analyze_sentiment", "en")
result <- nlp_annotate(pipeline, "This is a very boring movie. I recommend others to awoid this movie is not good..")
result %>% mutate(sentiments = sentiment.result) %>% pull(sentiments) %>% unlist()
result %>% mutate(checked_words = checked.result) %>% pull(checked_words) %>% unlist()
The spell checker inside this pipeline has corrected the word "awoid" to "avoid".