Approximate the profanity of text by grouping variable(s). See profanity
for a full description of the profanity detection algorithm, the
profanity/valence shifter keys that can be passed into the function, and
other arguments that can be passed.
profanity_by(text.var, by = NULL, group.names, ...)
text.var - The text variable. Also takes a
by - The grouping variable(s). Default NULL.
group.names - A vector of names that corresponds to group. Generally for internal use.
... - Other arguments passed to profanity.
Returns a data.table with grouping variables plus:
element_id - The id number of the original vector passed to profanity
sentence_id - The id number of the sentences within each element_id
word_count - Word count summed by grouping variable
profanity_count - The number of profanities used by grouping variable
sd - Standard deviation (sd) of the sentence level profanity rate by grouping variable
ave_profanity - Profanity rate
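To make the relationship between these columns concrete, here is a minimal base R sketch that aggregates made-up sentence-level counts by group. It is not the package's implementation: the data frame `sentences` and its numbers are hypothetical, and it assumes that the profanity rate is the pooled profanity count divided by the pooled word count, which the description above does not spell out.

```r
# Hypothetical sentence-level results (made-up numbers, not package output)
sentences <- data.frame(
  group           = c("a", "a", "b", "b", "b"),
  word_count      = c(10, 5, 8, 4, 8),
  profanity_count = c(1, 0, 2, 0, 0)
)

# Sentence-level profanity rate
sentences$rate <- sentences$profanity_count / sentences$word_count

# Aggregate by group; the pooled-rate definition is an assumption
by_group <- do.call(rbind, lapply(split(sentences, sentences$group), function(g) {
  data.frame(
    group           = g$group[1],
    word_count      = sum(g$word_count),
    profanity_count = sum(g$profanity_count),
    sd              = stats::sd(g$rate),
    ave_profanity   = sum(g$profanity_count) / sum(g$word_count)
  )
}))

print(by_group)
```

In this sketch the sd column is computed from the sentence-level rates within each group, mirroring the description of sd above.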
See sentiment_by for details about sentimentr chaining.
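As a rough illustration of a chained call, the sketch below pipes a character vector through get_sentences() and into profanity_by(). It assumes R 4.1 or later for the native `|>` pipe and an installed sentimentr; the variable names `txt` and `out` are only for illustration, and the chaining details documented under sentiment_by may cover more cases than this.

```r
library(sentimentr)

# Two short text elements; no grouping variable is supplied
txt <- c(
  "I am the best friend.",
  "Do you really like it? I'm not happy"
)

# Split into sentences once, then aggregate profanity
out <- txt |>
  get_sentences() |>
  profanity_by()

print(out)
```

Splitting sentences first follows the preferred pattern shown in the examples: the sentence splitting is done once and its result is reused.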
Other profanity functions:
profanity()
## Not run:
bw <- sample(lexicon::profanity_alvarez, 4)
mytext <- c(
    sprintf('do you like this %s? It is %s. But I hate really bad dogs', bw[1], bw[2]),
    'I am the best friend.',
    NA,
    sprintf('I %s hate this %s', bw[3], bw[4]),
    "Do you really like it? I'm not happy"
)
## works on a character vector, but this is not the preferred method because
## sentence boundary disambiguation is repeated every time `profanity` is run
profanity(mytext)
profanity_by(mytext)
## preferred method: run get_sentences() once up front to avoid that repeated cost
mytext <- get_sentences(mytext)
profanity_by(mytext)
get_sentences(profanity_by(mytext))
(myprofanity <- profanity_by(mytext))
stats::setNames(get_sentences(profanity_by(mytext)),
    round(myprofanity[["ave_profanity"]], 3))
brady <- get_sentences(crowdflower_deflategate)
library(data.table)
bp <- profanity_by(brady)
crowdflower_deflategate[bp[ave_profanity > 0,]$element_id, ]
vulgars <- bp[["ave_profanity"]] > 0
stats::setNames(get_sentences(bp)[vulgars],
    round(bp[["ave_profanity"]][vulgars], 3))
bt <- data.table(crowdflower_deflategate)[,
    source := ifelse(grepl('^RT', text), 'retweet', 'OP')][,
    belichick := grepl('\\bb[A-Za-z]+l[A-Za-z]*ch', text, ignore.case = TRUE)][]
prof_bel <- with(bt, profanity_by(text, by = list(source, belichick)))
plot(prof_bel)
## End(Not run)