get_tokens: Word Tokenization


get_tokens R Documentation

Word Tokenization

Description

Parses a string into a vector of word tokens.

Usage

get_tokens(text_of_file, pattern = "\\W", lowercase = TRUE)

Arguments

text_of_file

A text string to be tokenized.

pattern

A regular expression used to split the text into tokens. Default is "\\W" (any non-word character).

lowercase

Logical: should tokens be converted to lowercase? Default is TRUE.

Value

A character vector of word tokens.
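
Examples

A minimal usage sketch based on the Usage and Arguments sections above, assuming the syuzhet package is installed; the sample sentence is illustrative only.

library(syuzhet)

# Hypothetical sample text to tokenize
sample_text <- "The Quick Brown Fox jumps over the lazy dog."

# Split on non-word characters and lowercase the tokens (the defaults)
tokens <- get_tokens(sample_text, pattern = "\\W", lowercase = TRUE)
tokens
# expected to yield roughly: "the" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog"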
