rake: Rapid automatic extraction of keywords from documents

Description

This is the main function for extracting and returning ranked keywords. The algorithm tokenizes text into words and candidate phrases by splitting at stopwords. After tokenization, the frequency of each token and its degree (i.e. its co-occurrence with words in the remaining phrases) are used to rank phrases. The stopword list can combine common English words with a vector of domain-specific stop words.
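The stopword-based tokenization described above can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: `candidate_phrases` is a hypothetical helper, and the two-word stopword list is a toy stand-in for a real one.

```r
# Hypothetical helper (not part of rakeR): split one text into candidate
# phrases by breaking at punctuation and at stopwords.
candidate_phrases <- function(text, stop_words) {
  # Lower-case the text and cut it into fragments at punctuation.
  fragments <- unlist(strsplit(tolower(text), "[[:punct:]]+"))
  phrases <- character(0)
  for (fragment in fragments) {
    words <- unlist(strsplit(trimws(fragment), "[[:space:]]+"))
    words <- words[nzchar(words)]
    if (length(words) == 0) next
    # Every stopword starts a new group; the stopwords themselves are dropped.
    is_stop <- words %in% stop_words
    group <- cumsum(is_stop)
    kept <- split(words[!is_stop], group[!is_stop])
    phrases <- c(phrases, vapply(kept, paste, character(1), collapse = " "))
  }
  unname(phrases)
}

toy_stop_words <- c("of", "from")  # assumption: a tiny illustrative list
candidate_phrases("Rapid automatic extraction of keywords from documents.",
                  toy_stop_words)
# "rapid automatic extraction" "keywords" "documents"
```

Runs of non-stopwords stay together as a single phrase, which is why multi-word keywords such as "rapid automatic extraction" survive tokenization.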

Usage

rake(x, split_words = smart_stop_words(), split_punct = basic_punct(),
  top_fraction = 1/3)

Arguments

x

a character vector of texts to find keywords for

top_fraction

the fraction of the most highly ranked phrases to return

method

one of "degreeFreq", "degree", or "freq". "degreeFreq" is the ratio of degree to frequency (degree / frequency).
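To make the three scoring methods concrete, here is a minimal sketch of how word scores could be computed and summed per phrase. It assumes the usual RAKE definitions (a word's degree is the summed length of every phrase it appears in, counting the word itself), which may differ in detail from what this package does. `score_phrases` and the toy phrases are hypothetical.

```r
# Toy candidate phrases (assumption: already produced by tokenization).
phrases <- c("linear constraints", "linear diophantine equations", "constraints")

# Hypothetical helper (not part of rakeR): score phrases by summing word scores.
score_phrases <- function(phrases, method = c("degreeFreq", "degree", "freq")) {
  method <- match.arg(method)
  word_lists <- strsplit(phrases, " ", fixed = TRUE)
  words <- unlist(word_lists)
  # freq: how many times each word occurs across all candidate phrases.
  freq <- vapply(split(words, words), length, numeric(1))
  # degree: for each occurrence of a word, add the length of its phrase.
  phrase_len <- rep(lengths(word_lists), lengths(word_lists))
  degree <- vapply(split(phrase_len, words), sum, numeric(1))
  word_score <- switch(method,
    degreeFreq = degree / freq,
    degree = degree,
    freq = freq
  )
  # A phrase's score is the sum of the scores of its words.
  scores <- vapply(word_lists, function(w) sum(word_score[w]), numeric(1))
  names(scores) <- phrases
  sort(scores, decreasing = TRUE)
}

score_phrases(phrases, "degreeFreq")
# linear diophantine equations: 8.5, linear constraints: 4, constraints: 1.5
```

Under these assumptions "degree" favors words that appear in long phrases, "freq" favors words that appear often, and "degreeFreq" favors words that occur mostly inside longer phrases.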

Value

Returns a list containing one named integer vector per document. The names are the key phrases and the values are ranked.

Examples

rake(test_text, 5, "degree")
rake(test_text, 15)

lmkirvan/rakeR documentation built on May 14, 2019, 1:46 p.m.