wfm - Generate a word frequency matrix by grouping variable(s).
wfdf - Generate a word frequency data frame by grouping variable.
wfm.expanded - Expand a word frequency matrix to have multiple rows for each word.
wf.combine - Combine words (rows) of a word frequency data frame (wfdf) together.
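As a rough illustration of what a word frequency matrix holds, the sketch below builds one in base R. It does not use qdap and is not how wfm is implemented; the `text` and `group` vectors are made-up toy data.

```r
# Toy data (hypothetical; not qdap's DATA set)
text  <- c("I like fun", "fun is good", "I like you")
group <- c("sam", "greg", "sam")

# Lowercase, split on whitespace, and record which group each word came from
words <- strsplit(tolower(text), "\\s+")
long <- data.frame(
  word  = unlist(words),
  group = rep(group, lengths(words)),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are words, columns are groups, cells are raw counts
freq <- unclass(table(long$word, long$group))
freq
```

Each row is a word, each column a level of the grouping variable, and each cell the number of times that group used that word.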
wfm(text.var = NULL, grouping.var = NULL, wfdf = NULL,
    output = "raw", stopwords = NULL, digits = 2,
    char2space = "~~", ...)

wfdf(text.var, grouping.var = NULL, stopwords = NULL,
    margins = FALSE, output = "raw", digits = 2,
    char2space = "~~", ...)

wfm.expanded(text.var, grouping.var = NULL, ...)

wf.combine(wf.obj, word.lists, matrix = FALSE)
text.var | The text variable.
grouping.var | The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.
wfdf | A word frequency data frame given instead of raw text.var and optional grouping.var. Basically converts a word frequency dataframe (wfdf) to a word frequency matrix (
output | Output type (either
stopwords | A vector of stop words to remove.
digits | An integer indicating the number of decimal places (round) or significant digits (signif) to be used. Negative values are allowed.
margins | logical. If TRUE provides grouping.var and word variable totals.
... | Other arguments supplied to
wf.obj | A
word.lists | A list of character vectors of words to pass to
matrix | logical. If TRUE returns the output as a
char2space | A vector of characters to be turned into spaces. If
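The margins and digits arguments can be pictured with plain base R. The sketch below is an assumption-laden illustration, not qdap's code: `freq` is a hypothetical count matrix, and the "total" row and column names are chosen here for clarity rather than taken from the package.

```r
# Hypothetical raw counts (words x groups)
freq <- matrix(
  c(2, 0,
    1, 1),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("you", "fun"), c("greg", "sam"))
)

# Margins: append a per-word total column and a per-group total row
with_margins <- rbind(
  cbind(freq, total = rowSums(freq)),
  total = c(colSums(freq), sum(freq))
)
with_margins

# digits as used by round() and signif(); negative values round to tens, hundreds, ...
rounded_places <- round(123.456, 2)   # decimal places
rounded_signif <- signif(123.456, 2)  # significant digits
rounded_tens   <- round(123.456, -1)  # negative digits
```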
wfm - returns a word frequency of the class matrix.
wfdf - returns a word frequency of the class data.frame with a words column and optional margin sums.
wfm.expanded - returns a matrix similar to a word frequency matrix (wfm) but the rows are expanded to represent the maximum usages of the word and cells are dummy coded to indicate that number of uses.
wf.combine - returns a word frequency matrix (wfm) or dataframe (wfdf) with counts for the combined word.lists merged and remaining terms (else).
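The merging that wf.combine describes (listed words summed into one row, everything else collected under "else") can be sketched in base R as follows. This is a minimal stand-in under stated assumptions, not the package's implementation: `freq`, `word_lists` and the helper `combine_rows` are hypothetical names introduced for the example.

```r
# Hypothetical raw counts (words x groups)
freq <- matrix(
  c(2, 0,
    1, 1,
    0, 3,
    1, 2),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("you", "your", "read", "the"), c("greg", "sam"))
)

# Named list of words to merge; the list name becomes the new row name
word_lists <- list(yous = c("you", "your"))

combine_rows <- function(m, lists) {
  # One summed row per word list
  merged <- do.call(rbind, lapply(lists, function(w) {
    colSums(m[rownames(m) %in% w, , drop = FALSE])
  }))
  # All words not named in any list are pooled into a single "else" row
  rest <- colSums(m[!rownames(m) %in% unlist(lists), , drop = FALSE])
  out <- rbind(merged, rest)
  rownames(out)[nrow(out)] <- "else"
  out
}

combined <- combine_rows(freq, word_lists)
combined
```

The column totals are unchanged by the merge, which is what makes the combined table usable as a contingency table.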
Words can be kept together as a single word/entry by inserting a double tilde ("~~"), or another character string passed to char2space, between them. This is useful for keeping proper names as a single unit.
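The idea behind the double tilde can be shown with a few lines of base R. This is only a sketch of the mechanism (tokenize first, convert the joining characters to spaces afterwards), assuming whitespace tokenization; it is not qdap's internal code, and the sample `text` is invented.

```r
# "Jane~~Doe" is joined with the double tilde; plain "Jane" is not
text <- c("Jane~~Doe met Jane", "Jane~~Doe left")

# Split on whitespace first, so "Jane~~Doe" survives as one token
tokens <- unlist(strsplit(text, "\\s+"))

# Then turn the joining characters into a space for display
tokens <- gsub("~~", " ", tokens, fixed = TRUE)

counts <- table(tokens)
counts
```

Here "Jane Doe" is counted as its own entry, separate from the standalone "Jane".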
#word frequency matrix (wfm) example:
with(DATA, wfm(state, list(sex, adult)))[1:15, ]
with(DATA, wfm(state, person))[1:15, ]
#insert double tilde ("~~") to keep phrases (i.e., first last name)
alts <- c(" fun", "I ")
state2 <- mgsub(alts, gsub("\\s", "~~", alts), DATA$state)
with(DATA, wfm(state2, list(sex, adult)))[1:18, ]
#word frequency dataframe (wfdf) example:
with(DATA, wfdf(state, list(sex, adult)))[1:15, ]
with(DATA, wfdf(state, person))[1:15, ]
#insert double tilde ("~~") to keep dual words (i.e., first last name)
alts <- c(" fun", "I ")
state2 <- mgsub(alts, gsub("\\s", "~~", alts), DATA$state)
with(DATA, wfdf(state2, list(sex, adult)))[1:18, ]
#wfm.expanded example:
z <- wfm(DATA$state, DATA$person)
wfm.expanded(z)[30:45, ] #two "you"s
#wf.combine examples:
#===================
#raw no margins (will work)
x <- wfm(DATA$state, DATA$person)
#raw with margin (will work)
y <- wfdf(DATA$state, DATA$person, margins = TRUE)
WL1 <- c(y[, 1])
WL2 <- list(c("read", "the", "a"), c("you", "your", "you're"))
WL3 <- list(bob = c("read", "the", "a"), yous = c("you", "your", "you're"))
WL4 <- list(bob = c("read", "the", "a"), yous = c("a", "you", "your", "your're"))
WL5 <- list(yous = c("you", "your", "your're"))
WL6 <- list(c("you", "your", "your're")) #no name so will be called words 1
WL7 <- c("you", "your", "your're")
wf.combine(z, WL2) #Won't work not a raw frequency matrix
wf.combine(x, WL2) #Works (raw and no margins)
wf.combine(y, WL2) #Works (raw with margins)
wf.combine(y, c("you", "your", "your're"))
wf.combine(y, WL1)
wf.combine(y, WL3)
## wf.combine(y, WL4) #Error
wf.combine(y, WL5)
wf.combine(y, WL6)
wf.combine(y, WL7)
worlis <- c("you", "it", "it's", "no", "not", "we")
y <- wfdf(DATA$state, list(DATA$sex, DATA$adult), margins = TRUE)
z <- wf.combine(y, worlis, matrix = TRUE)
chisq.test(z)
chisq.test(wfm(wfdf = y))
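One reading of the wfm.expanded description (rows repeated up to a word's maximum usage, cells dummy coded) is sketched below in base R. The exact layout qdap produces may differ; `freq` and the helper `expand_rows` are hypothetical, and the rule "row j is 1 where the group used the word at least j times" is an assumption made for this example.

```r
# Hypothetical raw counts (words x groups)
freq <- matrix(
  c(2, 1,
    0, 1),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("you", "fun"), c("greg", "sam"))
)

expand_rows <- function(m) {
  # Number of rows each word needs: its largest count in any group
  reps <- apply(m, 1, max)
  rows <- lapply(rownames(m), function(w) {
    # Row j holds 1 where the group used the word at least j times, else 0
    do.call(rbind, lapply(seq_len(reps[[w]]), function(j) {
      as.integer(m[w, ] >= j)
    }))
  })
  out <- do.call(rbind, rows)
  dimnames(out) <- list(rep(rownames(m), reps), colnames(m))
  out
}

expanded <- expand_rows(freq)
expanded
```

Under this reading the column sums of the expanded matrix equal those of the original counts.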