tokenize_text: Clean up text into tokens

View source: R/NLP.R

tokenize_text    R Documentation

Clean up text into tokens

Description

Removes stop words, punctuation, and auxiliary verbs; lemmatizes the text; and converts it to lower case.

Usage

tokenize_text(corpus)

Arguments

corpus

A vector of text documents.

Value

A vector of cleaned text documents (lower-cased, lemmatized, with stop words and punctuation removed).
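
Examples

The cleaning steps listed in the Description can be sketched in base R as follows. This is a minimal illustration, not the package's actual implementation: the stop-word list here is a small hypothetical subset, and lemmatization is omitted because it requires an NLP backend.

```r
# Sketch of the cleaning steps: lower-case, strip punctuation,
# drop stop words. (Hypothetical helper; lemmatization omitted.)
tokenize_text_sketch <- function(corpus) {
  small_stopwords <- c("the", "a", "an", "is", "are", "was", "were", "by")
  corpus <- tolower(corpus)                    # change to lower-case
  corpus <- gsub("[[:punct:]]+", " ", corpus)  # remove punctuation
  tokens <- strsplit(trimws(corpus), "\\s+")   # split into tokens
  vapply(tokens, function(tk) {
    paste(tk[!tk %in% small_stopwords], collapse = " ")
  }, character(1))                             # one cleaned string per document
}

tokenize_text_sketch("The studies were reviewed by the authors.")
#> [1] "studies reviewed authors"
```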


bakaburg1/BaySREn documentation built on March 30, 2022, 12:16 a.m.