bag_o_words: Bag of Words

bag_o_words R Documentation

Bag of Words

Description

bag_o_words - Reduces a text column to a bag of words.

unbag - Wrapper for paste(collapse=" ") to glue words back into strings.

breaker - Reduces a text column to a bag of words and qdap-recognized end marks.

word_split - Reduces a text column to a list of vectors, each holding a bag of words and qdap-recognized end marks (i.e., ".", "!", "?", "*", "-").

Usage

bag_o_words(text.var, apostrophe.remove = FALSE, ...)

unbag(text.var, na.rm = TRUE)

breaker(text.var)

word_split(text.var)

Arguments

text.var

The text variable.

apostrophe.remove

logical. If TRUE, removes apostrophes from the output.

na.rm

logical. If TRUE, NAs are removed before pasting.

...

Additional arguments passed to strip.

Value

bag_o_words - Returns a vector of stripped words.

unbag - Returns a string.

breaker - Returns a vector of stripped words and qdap-recognized end marks (i.e., ".", "!", "?", "*", "-").

word_split - Returns a list of vectors of words and qdap-recognized end marks.

Examples

## Not run: 
bag_o_words("I'm going home!")
bag_o_words("I'm going home!", apostrophe.remove = TRUE)
unbag(bag_o_words("I'm going home!"))

bag_o_words(DATA$state)
by(DATA$state, DATA$person, bag_o_words)
lapply(DATA$state, bag_o_words)

breaker(DATA$state)
by(DATA$state, DATA$person, breaker)
lapply(DATA$state, breaker)
unbag(breaker(DATA$state))

word_split(c(NA, DATA$state))
unbag(word_split(c(NA, DATA$state)))

## End(Not run)
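
The description of unbag as a wrapper for paste(collapse=" ") can be illustrated with a minimal base R sketch. The helper name unbag_sketch is hypothetical and is not part of qdap. Flattening list input with unlist() is an assumption made here so that the output of word_split can be passed in, and the NA handling follows the na.rm argument as documented above. This approximates the behavior and is not the package's actual implementation.

```r
# Hypothetical helper (not part of qdap): approximates unbag() in base R.
unbag_sketch <- function(text.var, na.rm = TRUE) {
  # Assumption: flatten list input (e.g., a list of word vectors) first.
  words <- unlist(text.var)
  # Drop missing values before pasting when na.rm = TRUE.
  if (na.rm) words <- words[!is.na(words)]
  # Glue the words back into a single string.
  paste(words, collapse = " ")
}

unbag_sketch(c("i'm", "going", "home"))
unbag_sketch(list(c("going", NA), "home"))
unbag_sketch(c("going", NA, "home"), na.rm = FALSE)
```

With na.rm = FALSE, paste() converts a missing value to the literal text "NA", which is the likely reason the removal of NAs is the default.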

qdap documentation built on May 31, 2023, 5:20 p.m.