Analyze a set of texts to produce a dataset of percentages and other quantities describing the text, similar to the functionality supplied by the Linguistic Inquiry and Word Count standalone software distributed at http://liwc.wpengine.com.
x |
input object, a quanteda corpus or character vector for analysis |
... |
options passed to |
dictionary |
a quanteda dictionary object supplied for analysis |
tolower |
convert to common (lower) case before tokenizing |
verbose |
if |
a data.frame object containing the analytic results, one row per document supplied
The LIWC standalone software has many options for segmenting the text. While this function does not supply segmentation options, you can easily achieve the same effect by converting the input object into a corpus (if it is not already a corpus) and using tokens to split the input texts into smaller units based on user-supplied tags, sentence, or paragraph boundaries.
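The segmentation approach described above can be sketched roughly as follows. This is a minimal illustration, not the package's own example: it assumes a quanteda version that provides `corpus_reshape()` (used here in place of `tokens` to split by sentence), and the two short texts are made up for the purpose.

```r
library(quanteda)

# Two short illustrative documents (invented for this sketch)
txt <- c(doc1 = "The lawyer arrived early. She left at noon.",
         doc2 = "It rained all day.")
corp <- corpus(txt)

# Split each document into one "document" per sentence
# (assumption: corpus_reshape() is available in the installed quanteda)
sent_corp <- corpus_reshape(corp, to = "sentences")

ndoc(corp)       # number of original documents
ndoc(sent_corp)  # number of sentence-level segments

# Passing the reshaped corpus on should then give one row per segment,
# since the result has one row per document supplied:
# liwcalike(sent_corp, dictionary = myDict)
```

Using `to = "paragraphs"` instead would segment on paragraph boundaries in the same way.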
liwcalike(data_char_testphrases)
# examples for comparison
txt <- c("The red-shirted lawyer gave her yellow-haired, red nose ex-boyfriend $300
out of pity:(.")
myDict <- quanteda::dictionary(list(people = c("lawyer", "boyfriend"),
colorFixed = "red",
colorGlob = c("red*", "yellow*", "green*"),
mwe = "out of"))
liwcalike(txt, myDict, what = "word")
liwcalike(txt, myDict, what = "fasterword")
(toks <- quanteda::tokens(txt, what = "fasterword", removeHyphens = TRUE))
length(toks[[1]])
# LIWC says 12 words
## Not run: 
# works with LIWC 2015 dictionary too
liwc2015dict <- dictionary(file = "~/Dropbox/QUANTESS/dictionaries/LIWC/LIWC2015_English_Flat.dic",
format = "LIWC")
inaugLIWCanalysis <- liwcalike(data_corpus_inaugural, liwc2015dict)
inaugLIWCanalysis[1:6, 1:10]
##           docname Segment   WC      WPS Sixltr   Dic function article relativ motion
## 1 1789-Washington       1 1540 62.21739  24.35 253.1   52.403  9.0909 101.361 0.3483
## 2 1793-Washington       2  147 33.75000  25.17 250.3    5.065  0.9091  10.884 0.0387
## 3      1797-Adams       3 2584 62.72973  24.61 237.5   82.403 15.0649 163.946 0.3096
## 4  1801-Jefferson       4 1935 42.19512  20.36 253.2   62.143 10.0000 105.442 0.7353
## 5  1805-Jefferson       5 2381 48.13333  22.97 255.8   79.221 10.9091 151.701 0.6966
## 6    1809-Madison       6 1267 56.04762  24.78 258.2   42.987  8.3117  83.673 0.3870
## End(Not run)