get_tokens: Word Tokenization

View source: R/syuzhet.R


Word Tokenization

Description

Parses a string into a vector of word tokens.

Usage

get_tokens(text_of_file, pattern = "\\W", lowercase = TRUE)

Arguments

text_of_file

A text string to be tokenized.

pattern

A regular expression pattern on which to split the string into tokens.

lowercase

Logical. Should tokens be converted to lowercase? Default is TRUE.

Value

A character vector of word tokens.

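Examples

A minimal usage sketch; the sample sentence and the output shown are illustrative assumptions rather than examples taken from the package itself:

library(syuzhet)

# Split a sentence on non-word characters and lowercase the tokens
sample_text <- "The quick brown fox jumps over the lazy dog."
tokens <- get_tokens(sample_text, pattern = "\\W", lowercase = TRUE)
tokens
#> [1] "the"   "quick" "brown" "fox"   "jumps" "over"  "the"   "lazy"  "dog"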
