txt_sentiment: Perform dictionary-based sentiment analysis on a tokenised...
In udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

txt_sentiment

R Documentation

Perform dictionary-based sentiment analysis on a tokenised data frame

Description

This function identifies words which have a positive/negative meaning, with the addition of some basic logic regarding occurrences of amplifiers/deamplifiers and negators in the neighbourhood of the word which has a positive/negative meaning.

If a negator occurs in the neighbourhood, positive becomes negative and vice versa.
If amplifiers/deamplifiers occur in the neighbourhood, the amplifier weight is added to the sentiment polarity score.

This function took inspiration from qdap::polarity but was completely re-engineered to allow similar calculations on a udpipe-tokenised dataset. It works at the sentence level, and the negator/amplification logic cannot cross a boundary defined by the PUNCT upos part-of-speech tag.
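
The windowed logic described above can be sketched in base R. This is an illustration only and not the code used inside udpipe: `score_window()` is a hypothetical helper, and the way the amplifier weight is combined with the polarity (added to or subtracted from a base magnitude of 1, then sign-flipped by a negator) is an assumption made for the sketch.

```r
# Hypothetical helper, NOT the udpipe implementation: scores each polarity
# term by looking n_before/n_after tokens around it, never crossing PUNCT.
score_window <- function(tokens, upos, polarity_terms,
                         negators = character(), amplifiers = character(),
                         deamplifiers = character(),
                         amplifier_weight = 0.8, n_before = 4, n_after = 2) {
  # Look up the dictionary polarity of every token (NA if not in the dictionary)
  polarity <- polarity_terms$polarity[match(tokens, polarity_terms$term)]
  # Segment id increases after each PUNCT token
  segment <- cumsum(c(0, head(upos == "PUNCT", -1)))
  score <- rep(NA_real_, length(tokens))
  for (i in which(!is.na(polarity))) {
    idx <- seq(max(1, i - n_before), min(length(tokens), i + n_after))
    # Keep only neighbours in the same segment, excluding the term itself
    idx <- setdiff(idx[segment[idx] == segment[i]], i)
    context <- tokens[idx]
    # Assumption: amplifiers add the weight, deamplifiers subtract it
    magnitude <- 1 +
      amplifier_weight * any(context %in% amplifiers) -
      amplifier_weight * any(context %in% deamplifiers)
    # Assumption: a negator in the window flips the sign
    sign <- if (any(context %in% negators)) -1 else 1
    score[i] <- sign * polarity[i] * magnitude
  }
  score
}

dict <- data.frame(term = "like", polarity = 1, stringsAsFactors = FALSE)

# "not" and "whatsoever" both fall inside the window around "like"
demo <- score_window(
  tokens = c("I", "do", "not", "like", "whatsoever", "when", "."),
  upos   = c("PRON", "AUX", "PART", "VERB", "PRON", "SCONJ", "PUNCT"),
  polarity_terms = dict,
  negators = "not", amplifiers = "whatsoever")
demo[4]     # -1.8 under the assumptions of this sketch

# Here the negator sits before a PUNCT token, so it does not reach "like"
bounded <- score_window(
  tokens = c("not", ",", "I", "like", "it"),
  upos   = c("PART", "PUNCT", "PRON", "VERB", "PRON"),
  polarity_terms = dict,
  negators = "not")
bounded[4]  # 1
```

The scores that txt_sentiment itself returns may differ from this sketch; check scores$data on your own input to see the values actually produced.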

Note that if you prefer to build a supervised model to perform sentiment scoring you might be interested in looking at the ruimtehol R package https://github.com/bnosac/ruimtehol instead.

Usage

txt_sentiment(
  x,
  term = "lemma",
  polarity_terms,
  polarity_negators = character(),
  polarity_amplifiers = character(),
  polarity_deamplifiers = character(),
  amplifier_weight = 0.8,
  n_before = 4,
  n_after = 2,
  constrain = FALSE
)

Arguments

`x`	a data.frame with the columns doc_id, paragraph_id, sentence_id, upos and the column as indicated in `term`. This is exactly what `udpipe` returns.
`term`	a character string with the name of the column of `x` to which you want to apply the sentiment scoring
`polarity_terms`	data.frame containing terms which have positive or negative meaning. This data frame should contain the columns term and polarity where term is of type character and polarity can either be 1 or -1.
`polarity_negators`	a character vector of words which will invert the meaning of the `polarity_terms` such that -1 becomes 1 and vice versa
`polarity_amplifiers`	a character vector of words which amplify the `polarity_terms`
`polarity_deamplifiers`	a character vector of words which deamplify the `polarity_terms`
`amplifier_weight`	weight which is added to the polarity score if an amplifier occurs in the neighbourhood
`n_before`	integer indicating how many words before the `polarity_terms` word to look for negators/amplifiers/deamplifiers when applying the logic
`n_after`	integer indicating how many words after the `polarity_terms` word to look for negators/amplifiers/deamplifiers when applying the logic
`constrain`	logical indicating whether to make sure the aggregated sentiment score is between -1 and 1
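
As a small sketch of how the `polarity_terms` argument could be prepared, the following builds the data frame from two word vectors and checks the column requirements listed above. The word lists are made up for illustration, and the checks are a suggestion of this sketch rather than something the documentation says the function performs.

```r
# Hypothetical word lists, for illustration only
positive <- c("like", "good")
negative <- c("annoy", "painful")

# One row per term, polarity 1 for positive and -1 for negative words
polarity_terms <- data.frame(
  term = c(positive, negative),
  polarity = c(rep(1, length(positive)), rep(-1, length(negative))),
  stringsAsFactors = FALSE
)

# Sanity checks matching the documented requirements:
# term is character and polarity is either 1 or -1
stopifnot(
  is.character(polarity_terms$term),
  all(polarity_terms$polarity %in% c(-1, 1)),
  anyDuplicated(polarity_terms$term) == 0
)
```

The resulting data frame can then be passed as `polarity_terms`. The duplicate check is an extra precaution of this sketch so that each term maps to a single polarity.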

Value

a list containing

data: the x data.frame with 2 columns added: polarity and sentiment_polarity.
- The column polarity is simply the polarity column of the polarity_terms dataset, giving the polarity of the term to which the sentiment scoring is applied.
- The column sentiment_polarity is the value after the amplifier/deamplifier/negator logic has been applied.
overall: a data.frame with one row per doc_id containing the columns doc_id, sentences, terms, sentiment_polarity, terms_positive, terms_negative, terms_negation and terms_amplification. It provides the aggregate sentiment_polarity score of the dataset x by doc_id, as well as the terminology causing the sentiment, the number of sentences and the number of non-punctuation terms in the document.

Examples

x <- c("I do not like whatsoever when an R package has soo many dependencies.",
       "Making other people install java is annoying, 
        as it is a really painful experience in classrooms.")
## Not run: 
## Do the annotation to get the data.frame needed as input to txt_sentiment
anno <- udpipe(x, "english-gum")

## End(Not run)
anno <- data.frame(doc_id = c(rep("doc1", 14), rep("doc2", 18)), 
                   paragraph_id = 1,
                   sentence_id = 1,
                   lemma = c("I", "do", "not", "like", "whatsoever", 
                             "when", "an", "R", "package", 
                             "has", "soo", "many", "dependencies", ".", 
                             "Making", "other", "people", "install", 
                             "java", "is", "annoying", ",", "as", 
                             "it", "is", "a", "really", "painful", 
                             "experience", "in", "classrooms", "."),
                   upos = c("PRON", "AUX", "PART", "VERB", "PRON", 
                            "SCONJ", "DET", "PROPN", "NOUN", "VERB", 
                             "ADV", "ADJ", "NOUN", "PUNCT", 
                             "VERB", "ADJ", "NOUN", "ADJ", "NOUN", 
                             "AUX", "VERB", "PUNCT", "SCONJ", "PRON", 
                             "AUX", "DET", "ADV", "ADJ", "NOUN", 
                             "ADP", "NOUN", "PUNCT"),
                   stringsAsFactors = FALSE)
scores <- txt_sentiment(x = anno, 
              term = "lemma",
              polarity_terms = data.frame(term = c("annoy", "like", "painful"), 
                                          polarity = c(-1, 1, -1)), 
              polarity_negators = c("not", "neither"),
              polarity_amplifiers = c("pretty", "many", "really", "whatsoever"), 
              polarity_deamplifiers = c("slightly", "somewhat"))
scores$overall
scores$data
scores <- txt_sentiment(x = anno, 
              term = "lemma",
              polarity_terms = data.frame(term = c("annoy", "like", "painful"), 
                                          polarity = c(-1, 1, -1)), 
              polarity_negators = c("not", "neither"),
              polarity_amplifiers = c("pretty", "many", "really", "whatsoever"), 
              polarity_deamplifiers = c("slightly", "somewhat"),
              constrain = TRUE, n_before = 4,
              n_after = 2, amplifier_weight = .8)
scores$overall
scores$data

udpipe documentation built on Nov. 26, 2025, 5:07 p.m.
