clean_corpus: Clean a tidy data frame


Description

Cleans a tidy data frame representing a corpus of documents.

Usage

clean_corpus(corpus, rm_special = FALSE, rm_numeric = FALSE)

Arguments

corpus

A data frame containing a column named text

rm_special

If TRUE, remove special (non-alphanumeric) characters

rm_numeric

If TRUE, remove numeric characters
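
The effect of the two flags can be illustrated with a minimal base R sketch. It is not the tidygramr implementation: sketch_clean is a hypothetical stand-in, and the regular expressions (keeping whitespace when special characters are removed, and treating only the digits 0-9 as numeric) are assumptions made for illustration.

```r
# Hypothetical stand-in for illustration only; not the tidygramr implementation.
sketch_clean <- function(corpus, rm_special = FALSE, rm_numeric = FALSE) {
  # The input must be a data frame with a column named text.
  stopifnot(is.data.frame(corpus), "text" %in% names(corpus))
  if (rm_special) {
    # Assumption: keep letters, digits, and whitespace; drop everything else.
    corpus$text <- gsub("[^[:alnum:][:space:]]", "", corpus$text)
  }
  if (rm_numeric) {
    # Assumption: drop the digit characters 0-9 only.
    corpus$text <- gsub("[0-9]", "", corpus$text)
  }
  corpus
}

# Toy corpus with punctuation and digits in the text column.
toy <- data.frame(
  text = c("Chapter 1: Hello, world!", "It is 1813."),
  stringsAsFactors = FALSE
)
cleaned <- sketch_clean(toy, rm_special = TRUE, rm_numeric = TRUE)
print(cleaned$text)
```

With both flags left at FALSE, this sketch returns the data frame unchanged, which matches the defaults shown in Usage. Removing characters can leave doubled or trailing spaces behind; whether clean_corpus also normalizes whitespace is not stated here.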

Value

A data frame containing the cleaned corpus

Examples

library(tidygramr)
library(janeaustenr)
# austen_books() returns a data frame with a text column
corpus <- clean_corpus(austen_books())

cldatascience/tidygramr documentation built on May 10, 2019, 1:09 a.m.