as.dictionary: Coercion and checking functions for dictionary objects

Description Usage Arguments Value Examples

View source: R/dictionaries.R

Description

Convert a dictionary from a different format into a quanteda dictionary, or check to see if an object is a dictionary.

Usage

1
2
3
as.dictionary(x, format = c("tidytext"), separator = " ", tolower = FALSE)

is.dictionary(x)

Arguments

x

a dictionary-like object to be coerced or checked

format

input format for the object to be coerced to a dictionary; current legal values are a data.frame with the fields word and sentiment (as per the tidytext package)

separator

the character in between multi-word dictionary values. This defaults to " ".

tolower

if TRUE, convert all dictionary values to lowercase

Value

as.dictionary returns a quanteda dictionary object. This conversion function differs from the dictionary() constructor function in that it converts an existing object rather than creates one from components or from a file.

is.dictionary returns TRUE if an object is a quanteda dictionary.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
## Not run: 
data(sentiments, package = "tidytext")
as.dictionary(subset(sentiments, lexicon == "nrc"))
as.dictionary(subset(sentiments, lexicon == "bing"))
# to convert AFINN into polarities - adjust thresholds if desired
datafinn <- subset(sentiments, lexicon == "AFINN")
datafinn[["sentiment"]] <-
    with(datafinn,
         sentiment <- ifelse(score < 0, "negative",
                             ifelse(score > 0, "positive", "netural"))
    )
with(datafinn, table(score, sentiment))
as.dictionary(datafinn)

dat <- data.frame(
    word = c("Great", "Horrible"),
    sentiment = c("positive", "negative")
    )
as.dictionary(dat)
as.dictionary(dat, tolower = FALSE)

## End(Not run)

is.dictionary(dictionary(list(key1 = c("val1", "val2"), key2 = "val3")))
# [1] TRUE
is.dictionary(list(key1 = c("val1", "val2"), key2 = "val3"))
# [1] FALSE

koheiw/quanteda.core documentation built on Sept. 21, 2020, 3:44 p.m.