Strip Text

Description

Strip text of unwanted characters.

strip.character - character method for strip.

strip.factor - factor method for strip.

strip.default - default method for strip.

strip.list - list method for strip.

Usage

strip(x, char.keep = "~~", digit.remove = TRUE, apostrophe.remove = TRUE,
  lower.case = TRUE)

## S3 method for class 'character'
strip(x, char.keep = "~~", digit.remove = TRUE,
  apostrophe.remove = TRUE, lower.case = TRUE)

## S3 method for class 'factor'
strip(x, char.keep = "~~", digit.remove = TRUE,
  apostrophe.remove = TRUE, lower.case = TRUE)

## Default S3 method:
strip(x, char.keep = "~~", digit.remove = TRUE,
  apostrophe.remove = TRUE, lower.case = TRUE)

## S3 method for class 'list'
strip(x, char.keep = "~~", digit.remove = TRUE,
  apostrophe.remove = TRUE, lower.case = TRUE)

Arguments

x

The text variable.

char.keep

A character vector of symbols (i.e., punctuation) that strip should keep. By default every symbol is stripped except the double tilde "~~"; apostrophes are controlled separately by apostrophe.remove. The double tilde "~~" is kept as a convenient way to hold word groups together in functions that split text on spaces. To remove double tildes "~~" as well, set char.keep to NULL.

digit.remove

logical. If TRUE strips digits from the text.

apostrophe.remove

logical. If TRUE removes apostrophes from the output.

lower.case

logical. If TRUE forces all alpha characters to lower case.
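
To make the interaction of these arguments concrete, here is a minimal base-R sketch. The name strip_sketch and its internals are hypothetical: it only approximates the behaviour described above and is not the package's implementation, so its output may differ from strip in edge cases (for example hyphens, slashes, or NA values).

```r
# Hypothetical base-R approximation of the argument semantics described above.
# This is NOT the package's strip(); it is an illustration only.
strip_sketch <- function(x, char.keep = "~~", digit.remove = TRUE,
                         apostrophe.remove = TRUE, lower.case = TRUE) {
  x <- as.character(x)
  if (lower.case) x <- tolower(x)

  # Individual symbols that survive stripping (empty when char.keep is NULL)
  keep <- unique(unlist(strsplit(paste(char.keep, collapse = ""), "")))
  if (!apostrophe.remove) keep <- c(keep, "'")

  # A character is dropped unless it is a letter, a space, a kept symbol,
  # or a digit while digit.remove is FALSE
  drop_char <- function(ch) {
    is_alpha <- grepl("^[[:alpha:] ]$", ch)
    is_digit <- grepl("^[0-9]$", ch)
    !(is_alpha | ch %in% keep | (is_digit & !digit.remove))
  }

  vapply(strsplit(x, ""), function(chars) {
    out <- paste(chars[!drop_char(chars)], collapse = "")
    # Collapse runs of spaces left behind by removed characters, then trim
    trimws(gsub(" +", " ", out))
  }, character(1), USE.NAMES = FALSE)
}

txt <- "Hello, World! It's 2 nice~~days."
strip_sketch(txt)
strip_sketch(txt, apostrophe.remove = FALSE)
strip_sketch(txt, char.keep = NULL, digit.remove = FALSE)

# The kept "~~" holds a word group together when splitting on spaces
strsplit(strip_sketch("New~~York is big!"), " ")[[1]]
```

In this sketch the kept symbols are matched character by character, so "~~" simply means that "~" is retained; whether the real function treats multi-character entries the same way is an assumption.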

Value

Returns a vector of text that has been stripped of unwanted characters.

See Also

rm_stopwords

Examples

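## Not run: 
library(qdap)  # assumed: strip() and the DATA data set ship with the qdap package

DATA$state                                    # original text, no strip applied
strip(DATA$state)                             # defaults: symbols, digits and apostrophes removed
strip(DATA$state, apostrophe.remove = FALSE)  # keep apostrophes
strip(DATA$state, char.keep = c("?", "."))    # keep question marks and periods

## End(Not run)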
