termco - Search a transcript by any number of grouping variables for categories (themes) of grouped root terms. While there are other termco functions in the termco family (e.g., termco_d), termco is a more powerful and flexible wrapper intended for general use.
termco_d - Search a transcript by any number of grouping variables for root terms.
term_match - Search a transcript for words that exactly match term(s).
termco2mat - Convert a termco dataframe to a matrix for use with visualization functions (e.g., heatmap.2).
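To make the idea of themes concrete, here is a minimal base R sketch of counting grouped root terms by a grouping variable. It is not qdap's implementation: the data, the helper count_term, and the padding step are assumptions made for illustration, and the sketch counts non-overlapping fixed-substring matches only.

```r
# Hypothetical data; not taken from the qdap package.
dialogue <- c("the cat saw a dog", "I'm good", "an owl and the fox", "good good day")
person   <- c("sam", "sam", "greg", "greg")

# A match list: each named element is a theme made up of root terms.
ml <- list(articles = c(" the ", " a ", " an "), good = "good")

# Hypothetical helper: count non-overlapping fixed-substring hits of one term.
count_term <- function(term, x) {
  hits <- gregexpr(term, x, fixed = TRUE)[[1]]
  sum(hits > 0)  # gregexpr returns -1 when there is no match
}

# Assumption: lower-case and pad the text so " the " can match at string edges.
padded <- paste0(" ", tolower(dialogue), " ")

# Rows are text elements, columns are themes.
per_row <- sapply(ml, function(terms) {
  sapply(padded, function(x) sum(sapply(terms, count_term, x = x)),
         USE.NAMES = FALSE)
})

# Sum the theme counts within each level of the grouping variable.
theme_counts <- rowsum(per_row, group = person)
theme_counts
```

Each column of theme_counts corresponds to one named element of the match list, and each row to one level of the grouping variable.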
termco(
    text.var,
    grouping.var = NULL,
    match.list,
    short.term = TRUE,
    ignore.case = TRUE,
    elim.old = TRUE,
    percent = TRUE,
    digits = 2,
    apostrophe.remove = FALSE,
    char.keep = NULL,
    digit.remove = NULL,
    zero.replace = 0,
    ...
)
termco_d(
    text.var,
    grouping.var = NULL,
    match.string,
    short.term = FALSE,
    ignore.case = TRUE,
    zero.replace = 0,
    percent = TRUE,
    digits = 2,
    apostrophe.remove = FALSE,
    char.keep = NULL,
    digit.remove = TRUE,
    ...
)
term_match(text.var, terms, return.list = TRUE, apostrophe.remove = FALSE)
termco2mat(
    dataframe,
    drop.wc = TRUE,
    short.term = TRUE,
    rm.zerocol = FALSE,
    no.quote = TRUE,
    transform = TRUE,
    trim.terms = TRUE
)
text.var | The text variable.
grouping.var | The grouping variables. Default NULL.
match.list | A list of named character vectors.
short.term | logical. If
ignore.case | logical. If
elim.old | logical. If
percent | logical. If
digits | Integer; number of decimal places to round when printing.
apostrophe.remove | logical. If
char.keep | A character vector of symbol characters (i.e., punctuation) that strip should keep. The default is to strip everything except apostrophes.
digit.remove | logical. If
zero.replace | Value to replace 0 values with.
match.string | A vector of terms to search for. When using inside of
terms | The terms to search for in the text.var.
return.list | logical. If
dataframe | A termco (or termco_d) dataframe or object.
drop.wc | logical. If
rm.zerocol | logical. If
no.quote | logical. If
transform | logical. If
trim.terms | logical. If
... | Other arguments supplied to strip.
termco & termco_d - both return a list, of class "termco", of data frames and information regarding word counts:
raw | raw word counts by grouping variable
prop | proportional word counts by grouping variable; proportional to each individual's word use
rnp | a character combination data frame of raw and proportional
zero_replace | value to replace zeros with; mostly internal use
percent | The value of percent used for plotting purposes.
digits | integer value of number of digits to display; mostly internal use
term_match - returns a list or vector of possible words that match term(s).
termco2mat - returns a matrix of term counts.
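As a rough illustration of the data-frame-to-matrix step, the base R sketch below turns a small table of term counts into a numeric matrix with the grouping levels as row names. The data frame and its column layout (a grouping column, a word.count column, then one column per term) are hypothetical stand-ins, not the structure termco2mat is documented to accept.

```r
# Hypothetical counts table; the column layout is an assumption for illustration.
counts_df <- data.frame(
  person     = c("greg", "sam"),
  word.count = c(20, 13),
  the        = c(2, 1),
  good       = c(0, 3),
  stringsAsFactors = FALSE
)

# Drop the grouping and word-count columns, keeping only the term columns.
term_cols <- setdiff(names(counts_df), c("person", "word.count"))
mat <- as.matrix(counts_df[, term_cols])

# Use the grouping levels as row names so plots can label the rows.
rownames(mat) <- counts_df$person
mat
```

A numeric matrix shaped like this is the kind of input that heatmap-style plotting functions expect.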
Percentages are calculated as a ratio of counts of match.list elements to word counts. Word counts do not contain symbols or digits. As a result, searching for symbols, digits or small segments of full words (e.g., "to") can produce totals of more than 100%.
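The arithmetic behind this warning can be sketched in base R. The text below is hypothetical, and the counting (non-overlapping fixed-substring matches divided by a whitespace-split word count) is an assumed simplification of what termco does.

```r
# Hypothetical text with three words.
txt <- "to toto tomato"
n_words <- length(strsplit(txt, " ", fixed = TRUE)[[1]])

# Count every non-overlapping occurrence of the fragment "to",
# including those inside longer words.
hits <- gregexpr("to", txt, fixed = TRUE)[[1]]
n_hits <- sum(hits > 0)

# Ratio of fragment hits to word count, expressed as a percent.
pct <- 100 * n_hits / n_words
c(words = n_words, hits = n_hits, percent = round(pct, 2))
```

Because the fragment occurs five times across only three words, the ratio exceeds 100%.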
The match.list/match.string is (optionally) case and character sensitive. Spacing is an important way to grab specific words and requires careful thought. Using "read" will find the words "bread", "read", "reading", and "ready". To search for just the word "read", supply a vector such as c(" read ", " reads", " reading", " reader"). To search for non-character arguments (i.e., numbers and symbols), additional arguments from strip must be passed.
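The effect of spacing can be sketched with base R fixed-substring matching. This is an illustration rather than qdap's code: the word list is hypothetical, and it assumes each text element is padded with a leading and trailing space before matching.

```r
# Hypothetical words to search.
words <- c("bread", "read", "reading", "ready", "reads", "thread")

# An unpadded fragment matches anywhere inside a word.
loose <- words[grepl("read", words, fixed = TRUE)]

# Assumption: pad each element so terms with spaces can match at word edges.
padded <- paste0(" ", words, " ")
terms  <- c(" read ", " reads", " reading", " reader")

# Keep a word if any of the spaced terms occurs in its padded form.
keep <- vapply(padded, function(w) {
  any(vapply(terms, grepl, logical(1), x = w, fixed = TRUE))
}, logical(1))
strict <- words[keep]

loose
strict
```

Here loose contains all six words, while strict keeps only the forms that begin with the whole word "read".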
## Not run:
#termco examples:
term <- c("the ", "she", " wh")
(out <- with(raj.act.1, termco(dialogue, person, term)))
plot(out)
scores(out)
plot(scores(out))
counts(out)
plot(counts(out))
proportions(out)
plot(proportions(out))
# General form for match.list as themes
#
# ml <- list(
#     cat1 = c(),
#     cat2 = c(),
#     catn = c()
# )
ml <- list(
    cat1 = c(" the ", " a ", " an "),
    cat2 = c(" I'" ),
    "good",
    the = c("the", " the ", " the", "the")
)
(dat <- with(raj.act.1, termco(dialogue, person, ml)))
scores(dat) #useful for presenting in tables
counts(dat) #prop and raw counts are useful for performing calculations
proportions(dat)
datb <- with(raj.act.1, termco(dialogue, person, ml,
    short.term = FALSE, elim.old=FALSE))
ltruncdf(datb, 20, 6)
(dat2 <- data.frame(dialogue=c("@bryan is bryan good @br",
    "indeed", "@ brian"), person=qcv(A, B, A)))
ml2 <- list(wrds=c("bryan", "indeed"), "@", bryan=c("bryan", "@ br", "@br"))
with(dat2, termco(dialogue, person, match.list=ml2))
with(dat2, termco(dialogue, person, match.list=ml2, percent = FALSE))
DATA$state[1] <- "12 4 rgfr r0ffrg0"
termco(DATA$state, DATA$person, '0', digit.remove=FALSE)
DATA <- qdap::DATA
# Using with term_match and exclude
exclude(term_match(DATA$state, qcv(th), FALSE), "truth")
termco(DATA$state, DATA$person, exclude(term_match(DATA$state, qcv(th),
    FALSE), "truth"))
MTCH.LST <- exclude(term_match(DATA$state, qcv(th, i)), qcv(truth, stinks))
termco(DATA$state, DATA$person, MTCH.LST)
syns <- synonyms("doubt")
syns[1]
termco(DATA$state, DATA$person, unlist(syns[1]))
synonyms("doubt", FALSE)
termco(DATA$state, DATA$person, list(doubt = synonyms("doubt", FALSE)))
termco(DATA$state, DATA$person, syns)
#termco_d examples:
termco_d(DATA$state, DATA$person, c(" the", " i'"))
termco_d(DATA$state, DATA$person, c(" the", " i'"), ignore.case=FALSE)
termco_d(DATA$state, DATA$person, c(" the ", " i'"))
# termco2mat example:
MTCH.LST <- exclude(term_match(DATA$state, qcv(a, i)), qcv(is, it, am, shall))
termco_obj <- termco(DATA$state, DATA$person, MTCH.LST)
termco2mat(termco_obj)
plot(termco_obj)
plot(termco_obj, label = TRUE)
plot(termco_obj, label = TRUE, text.color = "red")
plot(termco_obj, label = TRUE, text.color="red", lab.digits=3)
## REVERSE TERMCO (return raw words found per variable)
df <- data.frame(x=1:6,
    y = c("the fluffy little bat" , "the man was round like a ball",
        "the fluffy little bat" , "the man was round like a ball",
        "he ate the chair" , "cough, cough"),
    stringsAsFactors=FALSE)
l <- list("bat" ,"man", "ball", "heavy")
z <- counts(termco(df$y, qdapTools::id(df), l))[, -2]
counts2list(z[, -1], z[, 1])
## politeness
politness <- c("please", "excuse me", "thank you", "you welcome",
    "you're welcome", "i'm sorry", "forgive me", "pardon me")
with(pres_debates2012, termco(dialogue, person, politness))
with(hamlet, termco(dialogue, person, politness))
## Term Use Percentage per N Words
dat <- with(raj, chunker(dialogue, person, n.words = 100, rm.unequal = TRUE))
dat2 <- list2df(dat, "Dialogue", "Person")
dat2[["Duration"]] <- unlist(lapply(dat, id, pad=FALSE))
dat2 <- qdap_df(dat2, "Dialogue")
Top5 <- sapply(split(raj$dialogue, raj$person), wc, FALSE) %>%
    sort(decreasing=TRUE) %>%
    list2df("wordcount", "person") %>%
    `[`(1:5, 2)
propdat <- dat2 %&%
    termco(list(Person, Duration), as.list(Top25Words[1:5]), percent = FALSE) %>%
    proportions %>%
    colsplit2df %>%
    reshape2::melt(id=c("Person", "Duration", "word.count"), variable="Word") %>%
    dplyr::filter(Person %in% Top5)
head(propdat)
library(ggplot2)
library(scales)  # provides percent() used for the axis labels below
ggplot(propdat, aes(y=value, x=Duration, group=Person, color=Person)) +
    geom_line(size=1.25) +
    facet_grid(Word~., scales="free_y") +
    ylab("Percent of Word Use") +
    xlab("Per 100 Words") +
    scale_y_continuous(labels = percent)
ggplot(propdat, aes(y=value, x=Duration, group=Word, color=Word)) +
    geom_line(size=1.25) +
    facet_grid(Person~.) +
    ylab("Percent of Word Use") +
    xlab("Per 100 Words") +
    scale_y_continuous(labels = percent)
ggplot(propdat, aes(y=value, x=Duration, group=Word)) +
    geom_line() +
    facet_grid(Word~Person, scales="free_y") +
    ylab("Percent of Word Use") +
    xlab("Per 100 Words") +
    scale_y_continuous(labels = percent) +
    ggthemes::theme_few()
## Discourse Markers: See...
## Schiffrin, D. (2001). Discourse markers: Language, meaning, and context.
##    In D. Schiffrin, D. Tannen, & H. E. Hamilton (Eds.), The handbook of
##    discourse analysis (pp. 54-75). Malden, MA: Blackwell Publishing.
discoure_markers <- list(
    response_cries = c(" oh ", " ah ", " aha ", " ouch ", " yuk "),
    back_channels = c(" uh-huh ", " uhuh ", " yeah "),
    summons = " hey ",
    justification = " because "
)
(markers <- with(pres_debates2012,
    termco(dialogue, list(person, time), discoure_markers)
))
plot(markers, high="red")
with(pres_debates2012,
    termco(dialogue, list(person, time), discoure_markers, elim.old = FALSE)
)
with(pres_debates2012,
    dispersion_plot(dialogue, unlist(discoure_markers), person, time)
)
## End(Not run)