View source: R/cleansing_corpus.R
cleansing_corpus | R Documentation |
The function cleanses text by removing escape characters, non-alphanumeric characters, long words, and excess whitespace, and by converting all letters to lower case.
cleansing_corpus(
  text,
  escape_chars = TRUE,
  nonalphanum = TRUE,
  longwords = TRUE,
  whitespace = TRUE,
  tolower = TRUE
)
text |
Character vector of free text to be cleansed. |
escape_chars |
If TRUE, removes escape characters for |
nonalphanum |
If TRUE, removes non-alphanumeric characters. |
longwords |
If TRUE, removes words with more than 35 characters. |
whitespace |
If TRUE, removes excess whitespace. |
tolower |
If TRUE, converts all letters to lower case. |
A character vector of the cleansed text.
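The cleansing steps described by the arguments can be approximated in base R. The following is only a minimal sketch under stated assumptions, not the package's actual source: the function name cleanse_sketch, the regular expressions, and the order of the steps are all hypothetical and chosen for illustration. Only the argument names and the 35-character threshold come from this page.

```r
# Hypothetical stand-in for illustration; NOT the implementation of cleansing_corpus().
# Each block mirrors one documented argument; the regexes are assumptions.
cleanse_sketch <- function(text,
                           escape_chars = TRUE,
                           nonalphanum = TRUE,
                           longwords = TRUE,
                           whitespace = TRUE,
                           tolower = TRUE) {
  if (escape_chars) {
    # Replace control characters (e.g. \n, \t, \r) with a space.
    text <- gsub("[[:cntrl:]]", " ", text)
  }
  if (nonalphanum) {
    # Replace anything that is not a letter, digit, or space with a space.
    text <- gsub("[^[:alnum:] ]", " ", text)
  }
  if (longwords) {
    # Drop runs of 36 or more non-space characters (words longer than 35).
    text <- gsub("\\S{36,}", "", text, perl = TRUE)
  }
  if (whitespace) {
    # Collapse repeated whitespace and trim both ends.
    text <- trimws(gsub("\\s+", " ", text))
  }
  if (tolower) {
    # base:: makes the call unambiguous, since the argument shares the name.
    text <- base::tolower(text)
  }
  text
}

cleanse_sketch("Hello,\tWorld!\n")
```

Because every step is guarded by its own flag, a single step can be switched off (for example tolower = FALSE to preserve case) without affecting the others; the real function may differ in how it handles the same inputs.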
txt <- "It has roots in a piece of classical Latin literature from 45 BC"
cleansing_corpus(txt)