text_to_words: Split string(s) of text 'x' into words.

View source: R/text_util_fun.R

text_to_words    R Documentation

Split string(s) of text x into words.

Description

text_to_words splits text x (consisting of one or more character strings) into a vector of its constituent words.

Usage

text_to_words(x)

Arguments

x

A string of text (required), typically a character vector.

Details

text_to_words removes all (standard) punctuation marks and spaces from the resulting text parts and returns a vector of the remaining character sequences (as its words).

Internally, text_to_words uses strsplit to split strings at punctuation marks (split = "[[:punct:]]") and blank spaces (split = "( ){1,}").
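The two splitting steps described above can be sketched in base R. This is a minimal illustration, not the ds4psy source: the function name split_into_words is hypothetical, and the final step of dropping empty strings is an assumption about how leftover fragments are removed.

```r
# Hypothetical re-implementation for illustration only (not the ds4psy source).
split_into_words <- function(x) {
  x <- as.character(x)

  # Step 1: split each string at punctuation marks.
  parts <- unlist(strsplit(x, split = "[[:punct:]]"))

  # Step 2: split the resulting parts at runs of one or more spaces.
  words <- unlist(strsplit(parts, split = "( ){1,}"))

  # Step 3 (assumed): drop the empty strings left behind by the splits.
  words[words != ""]
}

split_into_words(c("Hello!", "The end."))
# "Hello" "The" "end"

split_into_words("Don't stop")
# "Don" "t" "stop"
```

Because the apostrophe belongs to the [[:punct:]] class, splitting at punctuation also breaks contractions such as "Don't" into separate parts, as the second call shows.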

Value

A character vector (of words).

See Also

text_to_sentences for splitting text into a vector of sentences; text_to_chars for splitting text into a vector of characters; count_words for counting the frequency of words; strsplit for splitting strings.

Other text objects and functions: Umlaut, capitalize(), caseflip(), cclass, chars_to_text(), collapse_chars(), count_chars(), count_chars_words(), count_words(), invert_rules(), l33t_rul35, map_text_chars(), map_text_coord(), map_text_regex(), metachar, read_ascii(), text_to_chars(), text_to_sentences(), transl33t(), words_to_text()

Examples

# Default:
x <- c("Hello!", "This is a 1st sentence.", "This is the 2nd sentence.", "The end.")
text_to_words(x)


hneth/ds4psy documentation built on May 1, 2024, 4:26 a.m.