rm_stopwords: Remove Stop Words

View source: R/rm_stopwords.R

Description

Removal of stop words in a variety of contexts.

%sw% - Binary operator version of rm_stopwords that defaults to separate = FALSE.
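As a rough illustration of the relationship described above, the sketch below calls the operator and the function form side by side. It assumes qdap is installed and attached. The variable names are illustrative only, and the equivalence of the two results is inferred from the description rather than guaranteed.

```r
library(qdap)

x <- c("I like it alot", "I like it too")

# Operator form: stop words are removed and each sentence stays one string.
op_result <- x %sw% qdapDictionaries::Top25Words

# Function form with the setting the operator is documented to default to.
fn_result <- rm_stopwords(x, qdapDictionaries::Top25Words, separate = FALSE)

print(op_result)
print(fn_result)
```

The operator form is convenient for quick interactive use; the function form is needed whenever any other argument has to be changed.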

Usage

rm_stopwords(text.var, stopwords = qdapDictionaries::Top25Words,
  unlist = FALSE, separate = TRUE, strip = FALSE, unique = FALSE,
  char.keep = NULL, names = FALSE, ignore.case = TRUE,
  apostrophe.remove = FALSE, ...)

rm_stop(text.var, stopwords = qdapDictionaries::Top25Words, unlist = FALSE,
  separate = TRUE, strip = FALSE, unique = FALSE, char.keep = NULL,
  names = FALSE, ignore.case = TRUE, apostrophe.remove = FALSE, ...)

text.var %sw% stopwords

Arguments

text.var

A character string of text or a vector of character strings.

stopwords

A character vector of words to remove from the text. qdap has a number of data sets that can be used as stop words including: Top200Words, Top100Words, Top25Words. For the tm package's traditional English stop words use tm::stopwords("english").
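Because stopwords is an ordinary character vector, a built-in list can be extended with extra terms before it is passed in. The following is a minimal sketch assuming qdap is installed; the added terms and the sample sentence are made up for illustration.

```r
library(qdap)

# Start from a built-in list and add hypothetical domain-specific terms.
my_stops <- union(qdapDictionaries::Top25Words, c("computer", "fun"))

# Pass the combined character vector as the stopwords argument.
res <- rm_stopwords("The computer is fun.", stopwords = my_stops)
print(res)
```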

unlist

logical. If TRUE unlists into one vector. General use intended for when separate is FALSE.

separate

logical. If TRUE separates sentences into words. If FALSE retains sentences.

strip

logical. If TRUE strips the text of all punctuation except apostrophes.

unique

logical. If TRUE keeps only unique words (if unlist is TRUE) or sentences (if unlist is FALSE). General use intended for when unlist is TRUE.
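The two cases in this description can be pictured with base R alone. The sketch below does not call qdap; it uses hand-written stand-ins for the function's output to show what de-duplicating words versus sentences amounts to.

```r
# Stand-in for the default list output (one vector of words per sentence).
word_list <- list(c("computer", "fun", "fun"),
                  c("it's", "it's", "dumb"),
                  c("fun"))

# Flatten to one vector, then keep the first occurrence of each word.
all_words <- unlist(word_list)
unique_words <- unique(all_words)
print(unique_words)
# [1] "computer" "fun"      "it's"     "dumb"

# Stand-in for sentence output: duplicates are whole sentences.
sentences <- c("fun.", "fun.", "dumb.")
unique_sentences <- unique(sentences)
print(unique_sentences)
# [1] "fun."  "dumb."
```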

char.keep

If strip is TRUE this argument provides a means of retaining supplied character(s).

names

logical. If TRUE will name the elements of the vector or list with the original text.var.

ignore.case

logical. If TRUE stop words are removed regardless of case, and the text is converted to lower case. If FALSE stop word removal is case sensitive, and the original case of the text is kept.
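The effect of this flag can be approximated in a few lines of base R. The helper below, drop_stops, is hypothetical and is not qdap's implementation; it splits on whitespace only and ignores punctuation handling.

```r
# Hypothetical helper, not part of qdap: drop stop words from one sentence.
drop_stops <- function(sentence, stops, ignore.case = TRUE) {
  # Split on runs of whitespace; punctuation handling is omitted for brevity.
  tokens <- strsplit(sentence, "\\s+")[[1]]
  if (ignore.case) {
    # Lower-case both sides so matching no longer depends on case.
    tokens <- tolower(tokens)
    stops <- tolower(stops)
  }
  tokens[!tokens %in% stops]
}

insensitive <- drop_stops("The cat and THE dog", c("the", "and"))
print(insensitive)
# [1] "cat" "dog"

sensitive <- drop_stops("The cat and THE dog", c("the", "and"),
                        ignore.case = FALSE)
print(sensitive)
# [1] "The" "cat" "THE" "dog"
```

With case-sensitive matching, "The" and "THE" survive because only the exact lower-case forms appear in the stop word vector.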

apostrophe.remove

logical. If TRUE removes apostrophes from the output.

...

Further arguments passed to the strip function.

Value

Returns a vector of sentences, vector of words, or (default) a list of vectors of words with stop words removed. Output depends on supplied arguments.
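To make the three return shapes concrete, the sketch below checks the type of the result for three argument combinations. It assumes qdap is installed and uses its bundled DATA data set; the variable names are illustrative.

```r
library(qdap)

# Default: a list with one character vector of words per input element.
as_list <- rm_stopwords(DATA$state, qdapDictionaries::Top200Words)

# separate = FALSE: a character vector of sentences.
as_sentences <- rm_stopwords(DATA$state, qdapDictionaries::Top200Words,
                             separate = FALSE)

# unlist = TRUE: a single character vector of words.
as_words <- rm_stopwords(DATA$state, qdapDictionaries::Top200Words,
                         unlist = TRUE)

print(class(as_list))
print(class(as_sentences))
print(class(as_words))
```

Checking the shape up front is useful when the result feeds into code that expects either a list or a flat vector.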

See Also

strip, bag_o_words, stopwords

Examples

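## Not run: 
library(qdap)  # also attaches qdapDictionaries (Top25Words, Top200Words)

## Default: Top25Words removed, one vector of words per sentence
rm_stopwords(DATA$state)
## Alternative stop word lists (the first requires the tm package)
rm_stopwords(DATA$state, tm::stopwords("english"))
rm_stopwords(DATA$state, Top200Words)
## Strip punctuation as well
rm_stopwords(DATA$state, Top200Words, strip = TRUE)
## Keep sentences intact
rm_stopwords(DATA$state, Top200Words, separate = FALSE)
rm_stopwords(DATA$state, Top200Words, separate = FALSE, ignore.case = FALSE)
## Flatten into a single vector of words
rm_stopwords(DATA$state, Top200Words, unlist = TRUE)
rm_stopwords(DATA$state, Top200Words, unlist = TRUE, strip = TRUE)
rm_stop(DATA$state, Top200Words, unlist = TRUE, unique = TRUE)

## Binary operator form
c("I like it alot", "I like it too") %sw% qdapDictionaries::Top25Words

## End(Not run)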

Example output

[[1]]
[1] "computer" "fun"      "."        "not"      "too"      "fun"      "."       

[[2]]
[1] "no"   "it's" "not"  ","    "it's" "dumb" "."   

[[3]]
[1] "what"   "should" "we"     "do"     "?"     

[[4]]
[1] "liar"   ","      "stinks" "!"     

[[5]]
[1] "am"      "telling" "truth"   "!"      

[[6]]
[1] "how"     "can"     "we"      "certain" "?"      

[[7]]
[1] "there" "no"    "way"   "."    

[[8]]
[1] "distrust" "."       

[[9]]
[1] "what"    "talking" "about"   "?"      

[[10]]
[1] "shall" "we"    "move"  "?"     "good"  "then"  "."    

[[11]]
[1] "i'm"     "hungry"  "."       "let's"   "eat"     "."       "already"
[8] "?"      

[[1]]
[1] "computer" "fun"      "."        "fun"      "."       

[[2]]
[1] ","    "dumb" "."   

[[3]]
[1] "?"

[[4]]
[1] "liar"   ","      "stinks" "!"     

[[5]]
[1] "telling" "truth"   "!"      

[[6]]
[1] "can"     "certain" "?"      

[[7]]
[1] "way" "."  

[[8]]
[1] "distrust" "."       

[[9]]
[1] "talking" "?"      

[[10]]
[1] "shall" "move"  "?"     "good"  "."    

[[11]]
[1] "hungry"  "."       "eat"     "."       "already" "?"      

[[1]]
[1] "computer" "fun"      "."        "fun"      "."       

[[2]]
[1] "it's" ","    "it's" "dumb" "."   

[[3]]
[1] "?"

[[4]]
[1] "liar"   ","      "stinks" "!"     

[[5]]
[1] "am"      "telling" "truth"   "!"      

[[6]]
[1] "certain" "?"      

[[7]]
[1] "."

[[8]]
[1] "distrust" "."       

[[9]]
[1] "talking" "?"      

[[10]]
[1] "shall" "?"     "."    

[[11]]
[1] "i'm"     "hungry"  "."       "let's"   "eat"     "."       "already"
[8] "?"      

[[1]]
[1] "computer" "fun"      "fun"     

[[2]]
[1] "it's" "it's" "dumb"

[[3]]
character(0)

[[4]]
[1] "liar"   "stinks"

[[5]]
[1] "am"      "telling" "truth"  

[[6]]
[1] "certain"

[[7]]
character(0)

[[8]]
[1] "distrust"

[[9]]
[1] "talking"

[[10]]
[1] "shall"

[[11]]
[1] "i'm"     "hungry"  "let's"   "eat"     "already"

 [1] "computer fun. fun."              "it's, it's dumb."               
 [3] "?"                               "liar, stinks!"                  
 [5] "am telling truth!"               "certain?"                       
 [7] "."                               "distrust."                      
 [9] "talking?"                        "shall?."                        
[11] "i'm hungry. let's eat. already?"
 [1] "Computer fun. Not fun."              "No it's, it's dumb."                
 [3] "What?"                               "You liar, stinks!"                  
 [5] "am telling truth!"                   "How certain?"                       
 [7] "There."                              "distrust."                          
 [9] "What talking?"                       "Shall? Good."                       
[11] "I'm hungry. Let's eat. You already?"
 [1] "computer" "fun"      "."        "fun"      "."        "it's"    
 [7] ","        "it's"     "dumb"     "."        "?"        "liar"    
[13] ","        "stinks"   "!"        "am"       "telling"  "truth"   
[19] "!"        "certain"  "?"        "."        "distrust" "."       
[25] "talking"  "?"        "shall"    "?"        "."        "i'm"     
[31] "hungry"   "."        "let's"    "eat"      "."        "already" 
[37] "?"       
 [1] "computer" "fun"      "fun"      "it's"     "it's"     "dumb"    
 [7] "liar"     "stinks"   "am"       "telling"  "truth"    "certain" 
[13] "distrust" "talking"  "shall"    "i'm"      "hungry"   "let's"   
[19] "eat"      "already" 
 [1] "computer" "fun"      "."        "it's"     ","        "dumb"    
 [7] "?"        "liar"     "stinks"   "!"        "am"       "telling" 
[13] "truth"    "certain"  "distrust" "talking"  "shall"    "i'm"     
[19] "hungry"   "let's"    "eat"      "already" 
[1] "like alot" "like too" 

qdap documentation built on Nov. 20, 2017, 5:09 p.m.