Scores text with a simple model (xgboost or glm) and returns a predictive sentiment score, where values closer to 1 are more positive and values closer to -1 are more negative. The score can be used to determine whether the sentiment of a text is positive or negative.

sentiment_score( x = NULL, model = names(default_models), scoring = c("xgb", "glm"), scoring_version = "1.0", batch_size = 100, ... )

| Argument | Description |
| --- | --- |
| `x` | A plain text vector, or a column name if data is supplied. If you know what you're doing, you can also pass in a 512-D numeric embedding. |
| `model` | An embedding name from tensorflow-hub. Options include "en" (English, large or not) and "multi" (multi-lingual, large or not). |
| `scoring` | Model to use for scoring the embedding matrix (currently either "xgb" or "glm"). |
| `scoring_version` | The scoring version to use. Currently only "1.0" is available, but other versions might be supported in the future. |
| `batch_size` | Size of batches to use. Larger batches are faster than smaller ones, but take care not to exhaust your system memory. |
| `...` | Additional arguments passed to |
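
To make the `batch_size` trade-off concrete, here is a minimal sketch of how a text vector could be split into batches of at most `batch_size` elements. The helper `make_batches()` is hypothetical and is not part of sentiment.ai; how the package batches its input internally is not shown in this page, so treat this only as an illustration of the idea.

```r
# Hypothetical helper (NOT a sentiment.ai function): split a vector into
# consecutive batches holding at most `batch_size` elements each.
make_batches <- function(x, batch_size = 100) {
  stopifnot(is.numeric(batch_size), batch_size >= 1)
  if (length(x) == 0) return(list())
  # Elements 1..batch_size get id 1, the next batch_size get id 2, and so on.
  batch_id <- ceiling(seq_along(x) / batch_size)
  unname(split(x, batch_id))
}

texts   <- paste("example text", 1:250)   # made-up input
batches <- make_batches(texts, batch_size = 100)
lengths(batches)   # number of texts held in each batch
```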

Uses simple predictive models on embeddings to provide the probability of a positive score (rescaled to -1:1 for consistency with other packages).
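
As an illustration of the rescaling, the sketch below maps a probability in [0, 1] onto [-1, 1] with the linear transform `2 * p - 1`. That the package uses exactly this transform is an assumption; `rescale_prob()` and `unscale_score()` are hypothetical names introduced only for this example.

```r
# Assumed linear mapping from a probability in [0, 1] to a score in [-1, 1].
# Hypothetical helpers, not part of sentiment.ai.
rescale_prob <- function(p) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1, na.rm = TRUE))
  2 * p - 1
}

# Inverse mapping: recover the probability from a rescaled score.
unscale_score <- function(s) (s + 1) / 2

p <- c(0, 0.25, 0.5, 1)   # made-up probabilities of a positive sentiment
s <- rescale_prob(p)
s                  # -1.0 -0.5  0.0  1.0
unscale_score(s)   # back to the original probabilities
```

Under this assumption, a score of 0 corresponds to a probability of 0.5, that is, a text the model considers equally likely to be positive or negative.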

A numeric vector of length(x) containing the re-scaled sentiment probabilities.
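
Since the returned scores lie between -1 and 1, one simple way to turn them into labels is to threshold around 0. The sketch below uses made-up scores rather than real model output, and `label_sentiment()` with its `neutral_band` argument is a hypothetical helper, not a function from the package.

```r
# Hypothetical helper (NOT a sentiment.ai function): convert scores in
# [-1, 1] to labels. Scores within `neutral_band` of 0 are called neutral.
label_sentiment <- function(score, neutral_band = 0) {
  stopifnot(is.numeric(score), neutral_band >= 0)
  label <- ifelse(score >  neutral_band, "positive",
           ifelse(score < -neutral_band, "negative", "neutral"))
  factor(label, levels = c("negative", "neutral", "positive"))
}

scores <- c(-0.82, -0.05, 0.10, 0.93)   # made-up scores for illustration

label_sentiment(scores)                     # sign of the score only
label_sentiment(scores, neutral_band = 0.2) # weak scores become neutral
```

The width of the neutral band is a judgment call that depends on the data, so it is worth checking any chosen value against labelled examples.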

    ## Not run: 
    envname <- "r-sentiment-ai"

    # make sure to install sentiment ai (install_sentiment.ai)
    # install_sentiment.ai(envname = envname,
    #                      method = "conda")

    # running the model
    mod_xgb <- sentiment_score(x = airline_tweets$text,
                               model = "en.large",
                               scoring = "xgb",
                               envname = envname)
    mod_glm <- sentiment_score(x = airline_tweets$text,
                               model = "en.large",
                               scoring = "glm",
                               envname = envname)

    # checking performance
    pos_neg <- factor(airline_tweets$airline_sentiment,
                      levels = c("negative", "neutral", "positive"))
    pos_neg <- (as.numeric(pos_neg) - 1) / 2
    cosine(mod_xgb, pos_neg)
    cosine(mod_glm, pos_neg)

    # you could also calculate accuracy/kappa
    ## End(Not run)

