Simple Sentiment Scores


Scores text with a simple model (xgboost or glm) and returns a predictive score, where values closer to 1 are more positive and values closer to -1 are more negative. The score can be used to determine whether the sentiment is positive or negative.


sentiment_score(
  x = NULL,
  model = names(default_models),
  scoring = c("xgb", "glm"),
  scoring_version = "1.0",
  batch_size = 100,
  ...
)



x: A plain text vector, or a column name if data is supplied. If you know what you're doing, you can also pass in a 512-D numeric embedding.


model: The name of an embedding from tensorflow-hub. Options include "en" (English) and "multi" (multilingual), each available in a large and a non-large variant.


scoring: Model to use for scoring the embedding matrix (currently either "xgb" or "glm").


scoring_version: The scoring version to use. Currently only "1.0" is available, but other versions might be supported in the future.


batch_size: Size of batches to use. Larger batches are faster than smaller ones, but take care not to exhaust your system memory!


...: Additional arguments passed to conda_install() or virtualenv_install().
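
To give a feel for what a batch size means in practice, here is a minimal base R sketch of splitting a text vector into consecutive batches. It is not the package's internal code: `make_batches` is a hypothetical helper and the input texts are made up. It only illustrates that a batch size of 100 turns 250 inputs into three chunks, the last one partial.

```r
# Hypothetical helper (not part of the package): split a vector into
# consecutive batches of at most `batch_size` elements.
make_batches <- function(x, batch_size = 100) {
  stopifnot(batch_size >= 1)
  if (length(x) == 0) return(list())
  # Elements 1..batch_size get id 1, the next batch_size get id 2, and so on.
  batch_id <- ceiling(seq_along(x) / batch_size)
  unname(split(x, batch_id))
}

texts   <- sprintf("text %d", 1:250)  # made-up inputs
batches <- make_batches(texts, batch_size = 100)
lengths(batches)                      # number of elements in each batch
```

Each batch would be held in memory at once, which is why very large batch sizes can exhaust system memory.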


Uses simple predictive models on embeddings to provide the probability of a positive score (rescaled to -1:1 for consistency with other packages).
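
The rescaling can be illustrated with a small sketch. It assumes a plain linear map, score = 2p - 1, which is one natural way to move a probability from 0:1 onto -1:1. The source does not state the exact formula, and `rescale_prob` is a hypothetical helper rather than a package function.

```r
# Hypothetical helper: linearly map probabilities in [0, 1] onto [-1, 1].
# Assumption: the rescaling is linear (2 * p - 1).
rescale_prob <- function(p) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1, na.rm = TRUE))
  2 * p - 1
}

probs  <- c(0, 0.25, 0.5, 0.9, 1)  # made-up probabilities of a positive score
scores <- rescale_prob(probs)
scores
```

Under this assumption a probability of 0.5 maps to a score of 0, so the sign of the score indicates which class is more likely.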


A numeric vector of length(x) containing rescaled sentiment probabilities.


## Not run: 
envname <- "r-sentiment-ai"

# make sure to install sentiment ai (install_sentiment.ai)
# install_sentiment.ai(envname = envname,
#                      method  = "conda")

# running the model
mod_xgb <- sentiment_score(x       = airline_tweets$text,
                           model   = "en.large",
                           scoring = "xgb",
                           envname = envname)
mod_glm <- sentiment_score(x       = airline_tweets$text,
                           model   = "en.large",
                           scoring = "glm",
                           envname = envname)

# checking performance
pos_neg <- factor(airline_tweets$airline_sentiment,
                  levels = c("negative", "neutral", "positive"))
pos_neg <- (as.numeric(pos_neg) - 1) / 2
cosine(mod_xgb, pos_neg)
cosine(mod_glm, pos_neg)

# you could also calculate accuracy/kappa
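
As a rough illustration of the accuracy/kappa idea, the sketch below uses only base R. The scores and labels are made up, the cut-off at 0 is an assumption, and only two classes are used (the airline data above also has a neutral class, which would need its own handling).

```r
# Made-up scores in [-1, 1] and made-up true labels, for illustration only.
scores <- c(0.8, -0.6, 0.3, -0.2, 0.9, -0.7)
truth  <- c("positive", "negative", "negative",
            "negative", "positive", "positive")

# Assumed cut-off: scores above 0 count as positive.
lv   <- c("negative", "positive")
pred <- factor(ifelse(scores > 0, "positive", "negative"), levels = lv)
obs  <- factor(truth, levels = lv)

# Confusion table and accuracy (share of matching labels).
tab      <- table(pred, obs)
n        <- sum(tab)
accuracy <- sum(diag(tab)) / n

# Cohen's kappa: observed agreement corrected for chance agreement.
p_chance    <- sum(rowSums(tab) * colSums(tab)) / n^2
cohen_kappa <- (accuracy - p_chance) / (1 - p_chance)

accuracy
cohen_kappa
```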

