get_freq_terms: Get the frequency of each term in a character vector

Description

This function converts the input character vector into a corpus and applies the clean_corpus function from this package. The resulting corpus is converted into a document term matrix, from which the number of times each term occurs is summed to get the frequency.
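The steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual source: sketch_freq is a hypothetical name, and the cleaning steps shown (lowercasing, punctuation removal, stopword removal, whitespace stripping) are a guess at what clean_corpus does, since its exact transformations are not listed here.

```r
library(tm)

# Hypothetical stand-in for get_freq_terms(); not the package's actual source.
sketch_freq <- function(vec, stopwords = NULL) {
  # Convert the character vector into a corpus, one document per element.
  corpus <- VCorpus(VectorSource(vec))
  # Assumed cleaning steps; the real clean_corpus may differ.
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeWords, c(tm::stopwords("en"), stopwords))
  corpus <- tm_map(corpus, stripWhitespace)
  # Build the document term matrix (rows are documents, columns are terms).
  dtm <- DocumentTermMatrix(corpus)
  # Sum each term's column across documents to get its total frequency.
  colSums(as.matrix(dtm))
}
```

Note that DocumentTermMatrix drops terms shorter than three characters by default, so very short words would not appear in the result of this sketch.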

Usage

get_freq_terms(vec, stopwords = NULL)

Arguments

vec

character vector to get frequencies from

stopwords

Optional. Additional stopwords to remove from the corpus, given as a single string or a character vector. If not specified, only the English stopwords from the tm package are removed.

Value

A data frame. Each row represents a term from the character vector, with its frequency and the proportion of times it occurred in the vector. Rows are ordered from most to least frequent.
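As a rough illustration of the shape of such a result, the base R sketch below builds a comparable data frame from a made-up vector of term counts. The column names (term, frequency, proportion) and the choice to compute the proportion as each term's share of the total count are assumptions; the actual output of get_freq_terms may name or compute them differently.

```r
# Hypothetical term counts, standing in for the summed document term matrix.
counts <- c(data = 3, model = 1, text = 4)

freq_df <- data.frame(
  term = names(counts),
  frequency = as.vector(counts),
  # Assumed definition: share of all counted term occurrences.
  proportion = as.vector(counts) / sum(counts),
  stringsAsFactors = FALSE
)

# Order rows from most to least frequent and reset the row names.
freq_df <- freq_df[order(freq_df$frequency, decreasing = TRUE), ]
rownames(freq_df) <- NULL
```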

Examples

words <- c("remove these words", "count the other words")  # illustrative input
get_freq_terms(words, stopwords = c("remove", "these"))

loshita/oshitar documentation built on May 8, 2019, 11:12 p.m.