View source: R/extract_profanity_terms.R
Extract the profanity words from a text.
extract_profanity_terms(
  text.var,
  profanity_list = unique(tolower(lexicon::profanity_alvarez)),
  ...
)
text.var
  The text variable. Can be a
profanity_list
  An atomic character vector of profane words. The lexicon package has lists that can be used, including:
...
  Ignored.
Returns a data.table with a column of profane terms.
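To make the role of `profanity_list` concrete, here is a minimal base-R sketch of list-based term extraction. It is not the sentimentr implementation: the helper `sketch_extract_terms`, its simple tokenizer, and the placeholder word list are assumptions made purely for illustration. The only detail taken from the documentation is that the list is lowercased and de-duplicated, as in the default `unique(tolower(...))`.

```r
# Hypothetical helper, NOT part of sentimentr: a base-R sketch of
# matching the words of each sentence against a profanity list.
sketch_extract_terms <- function(sentences, profanity_list) {
  # Normalize the list the same way the documented default does.
  profanity_list <- unique(tolower(profanity_list))
  lapply(sentences, function(s) {
    # Missing text yields no terms.
    if (is.na(s)) return(character(0))
    # Lowercase, then split on anything that is not a letter, digit, or apostrophe.
    tokens <- strsplit(tolower(s), "[^a-z0-9']+")[[1]]
    tokens <- tokens[nzchar(tokens)]
    # Keep the unique tokens that appear in the list, in sorted order.
    sort(unique(tokens[tokens %in% profanity_list]))
  })
}

# Placeholder words stand in for a real profanity list.
toy_list <- c("Darn", "heck", "darn")
toy_text <- c("Well heck, that is a darn shame.", "I am the best friend.", NA)
res <- sketch_extract_terms(toy_text, toy_list)
print(res)
```

This sketch only matches single-word entries. Whether and how the package handles multi-word entries is not covered here.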
## Not run:
bw <- sample(lexicon::profanity_alvarez, 4)
mytext <- c(
    sprintf('do you %s like this %s? It is %s. But I hate really bad dogs', bw[1], bw[2], bw[3]),
    'I am the best friend.',
    NA,
    sprintf('I %s hate this %s', bw[3], bw[4]),
    "Do you really like it? I'm not happy"
)
x <- get_sentences(mytext)
profanity(x)
prof_words <- extract_profanity_terms(x)
prof_words
prof_words$sentence
prof_words$neutral
prof_words$profanity
data.table::as.data.table(prof_words)
attributes(extract_profanity_terms(x))$counts
attributes(extract_profanity_terms(x))$elements
brady <- get_sentences(crowdflower_deflategate)
brady_swears <- extract_profanity_terms(brady)
attributes(extract_profanity_terms(brady))$counts
attributes(extract_profanity_terms(brady))$elements
## End(Not run)
   element_id sentence_id word_count profanity_count profanity
1:          1           1          6               2 0.3333333
2:          1           2          3               1 0.3333333
3:          1           3          6               0 0.0000000
4:          2           1          5               0 0.0000000
5:          3           1         NA               0 0.0000000
6:          4           1          5               2 0.4000000
7:          5           1          5               0 0.0000000
8:          5           2          3               0 0.0000000
   element_id sentence_id    profanity
1:          1           1   sh1t,sh1tz
2:          1           2          feg
3:          1           3
4:          2           1
5:          3           1
6:          4           1 feg,schaffer
7:          5           1
8:          5           2
[1] "do you sh1t like this sh1tz?" "It is feg."
[3] "But I hate really bad dogs" "I am the best friend."
[5] NA "I feg hate this schaffer"
[7] "Do you really like it?" "I'm not happy"
[[1]]
[1] "do" "like" "this" "you"
[[2]]
[1] "is" "it"
[[3]]
[1] "bad" "but" "dogs" "hate" "i" "really"
[[4]]
[1] "am" "best" "friend" "i" "the"
[[5]]
[1] NA
[[6]]
[1] "hate" "i" "this"
[[7]]
[1] "do" "it" "like" "really" "you"
[[8]]
[1] "happy" "i'm" "not"
[[1]]
[1] "sh1t" "sh1tz"
[[2]]
[1] "feg"
[[3]]
character(0)
[[4]]
character(0)
[[5]]
character(0)
[[6]]
[1] "feg" "schaffer"
[[7]]
character(0)
[[8]]
character(0)
   element_id sentence_id                    neutral    profanity
1:          1           1           do,like,this,you   sh1t,sh1tz
2:          1           2                      is,it          feg
3:          1           3 bad,but,dogs,hate,i,really
4:          2           1       am,best,friend,i,the
5:          3           1                         NA
6:          4           1                hate,i,this feg,schaffer
7:          5           1      do,it,like,really,you
8:          5           2              happy,i'm,not
                       sentence
1: do you sh1t like this sh1tz?
2:                   It is feg.
3:   But I hate really bad dogs
4:        I am the best friend.
5:                         <NA>
6:     I feg hate this schaffer
7:       Do you really like it?
8:                I'm not happy
words profanity n
1: feg 1 2
2: schaffer 1 1
3: sh1t 1 1
4: sh1tz 1 1
5: i 0 3
6: do 0 2
7: hate 0 2
8: it 0 2
9: like 0 2
10: really 0 2
11: this 0 2
12: you 0 2
13: <NA> 0 1
14: am 0 1
15: bad 0 1
16: best 0 1
17: but 0 1
18: dogs 0 1
19: friend 0 1
20: happy 0 1
21: i'm 0 1
22: is 0 1
23: not 0 1
24: the 0 1
words profanity n
element_id sentence_id words profanity
1: 3 1 <NA> 0
2: 2 1 am 0
3: 1 3 bad 0
4: 2 1 best 0
5: 1 3 but 0
6: 1 1 do 0
7: 5 1 do 0
8: 1 3 dogs 0
9: 1 2 feg 1
10: 4 1 feg 1
11: 2 1 friend 0
12: 5 2 happy 0
13: 1 3 hate 0
14: 4 1 hate 0
15: 1 3 i 0
16: 2 1 i 0
17: 4 1 i 0
18: 5 2 i'm 0
19: 1 2 is 0
20: 1 2 it 0
21: 5 1 it 0
22: 1 1 like 0
23: 5 1 like 0
24: 5 2 not 0
25: 1 3 really 0
26: 5 1 really 0
27: 4 1 schaffer 1
28: 1 1 sh1t 1
29: 1 1 sh1tz 1
30: 2 1 the 0
31: 1 1 this 0
32: 4 1 this 0
33: 1 1 you 0
34: 5 1 you 0
element_id sentence_id words profanity
words profanity n
1: shit 1 51
2: fuck 1 45
3: ass 1 39
4: fucking 1 23
5: crap 1 16
---
26467: ~2 0 1
26468: ~jesse 0 1
26469: ~reed 0 1
26470: ~tb 0 1
26471: ~tom 0 1
element_id sentence_id words profanity
1: 405 1 0
2: 426 1 0
3: 663 1 0
4: 672 1 0
5: 760 1 0
---
187038: 17612 1 ~2 0
187039: 11943 1 ~jesse 0
187040: 6400 1 ~reed 0
187041: 7815 1 ~tb 0
187042: 12905 1 ~tom 0
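In the small example above, the `n` values in the `counts` attribute sum to 34, the number of rows in the `elements` attribute. That is consistent with `counts` being a tally of `elements` rows per word, although the page does not state how it is computed. The base-R sketch below shows one way to produce such a tally. The toy `elements` data frame and the sort order (profane words first, then by descending count) are assumptions modeled on the printed output, not the package's code.

```r
# Toy stand-in for the `elements` attribute (assumed layout: one row per
# distinct word per sentence, with a 0/1 profanity flag).
elements <- data.frame(
  element_id  = c(1, 1, 1, 4, 4),
  sentence_id = c(1, 1, 2, 1, 1),
  words       = c("sh1t", "you", "feg", "feg", "hate"),
  profanity   = c(1, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Count rows per (word, profanity flag) pair.
elements$n <- 1L
counts <- aggregate(n ~ words + profanity, data = elements, FUN = sum)

# Order like the printed output: profane words first, then by count, then by word.
counts <- counts[order(-counts$profanity, -counts$n, counts$words), ]
rownames(counts) <- NULL
print(counts)
```

Note that the formula interface of `aggregate()` drops rows with missing values by default, so an `NA` word like the one in the sample output would need separate handling.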