text_summarize: Summarize the key points from input text


Description

This function returns a data.frame summarizing the input text: total word count, total sentence count, most common and least common words, average word length, and average sentence length. Each statistic is stored in its own column.

Usage

text_summarize(txt, stop_remove = FALSE, remove_punctuation = TRUE,
  remove_number = TRUE, case_sensitive = FALSE)

Arguments

txt: string

stop_remove: Boolean

remove_punctuation: Boolean

remove_number: Boolean

case_sensitive: Boolean
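The page gives only the type of each argument, not its effect. As a rough illustration, the sketch below shows one plausible reading of the four flags, inferred purely from their names. The function `sketch_tokenize` and its tiny stopword list are hypothetical and are not taken from the RSyntext source, so the package's actual behavior may differ.

```r
# Hypothetical sketch: one plausible reading of the four flags, based only
# on their names. This is NOT the RSyntext implementation.
sketch_tokenize <- function(txt, stop_remove = FALSE, remove_punctuation = TRUE,
                            remove_number = TRUE, case_sensitive = FALSE) {
  # Assumption: case-insensitive counting means "This" and "this" are one word
  if (!case_sensitive) txt <- tolower(txt)
  # Assumption: punctuation and digits are replaced by spaces before splitting
  if (remove_punctuation) txt <- gsub("[[:punct:]]", " ", txt)
  if (remove_number) txt <- gsub("[[:digit:]]+", " ", txt)

  # Split on runs of whitespace and drop empty tokens
  words <- unlist(strsplit(trimws(txt), "\\s+"))
  words <- words[nzchar(words)]

  if (stop_remove) {
    # Tiny illustrative stopword list; a real list would be much longer
    stopwords <- c("a", "an", "the", "is", "in")
    words <- words[!(tolower(words) %in% stopwords)]
  }
  words
}
```

Under these assumptions, `sketch_tokenize("This is 1 test.")` yields the tokens "this", "is", and "test", and setting `stop_remove = TRUE` additionally drops "is".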

Details

Created on 09 February, 2019

Authors: Harjyot Kaur

Takes in a string and returns a data.frame with one row and six columns:

First column contains the total word count of the string.

Second column contains the total number of sentences in the text.

Third column contains a list of the most common words in the text. A list of length 1 means there is a single most common word. A list of length > 1 means several words are tied for the highest count.

Fourth column contains a list of the least common words in the text. A list of length 1 means there is a single least common word. A list of length > 1 means several words are tied for the lowest count.

Fifth column contains the average word length in the text.

Sixth column contains the average number of words per sentence in the text.
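To make the six columns concrete, here is a minimal base-R sketch that computes the same kinds of statistics. It is not the RSyntext implementation: the function name `sketch_summary`, the column names, and the simplifications (lowercasing everything, treating `.`, `!` and `?` as sentence ends, keeping only letters) are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the RSyntext implementation: a base-R sketch of
# the six statistics described above. Column names are invented here.
sketch_summary <- function(txt) {
  # Assumption: a sentence ends at one or more of . ! ?
  sentences <- trimws(unlist(strsplit(txt, "[.!?]+")))
  sentences <- sentences[nzchar(sentences)]

  # Lowercase, replace everything except letters and whitespace with a
  # space, then split on runs of whitespace
  clean <- gsub("[^a-z[:space:]]", " ", tolower(txt))
  words <- unlist(strsplit(trimws(clean), "\\s+"))
  words <- words[nzchar(words)]

  # Frequency table; ties are kept, so each result may hold several words
  counts <- table(words)
  freq   <- as.vector(counts)
  most   <- names(counts)[freq == max(freq)]
  least  <- names(counts)[freq == min(freq)]

  # I(list(...)) stores a whole character vector in a single data.frame cell
  data.frame(
    word_count          = length(words),
    sentence_count      = length(sentences),
    most_common         = I(list(most)),
    least_common        = I(list(least)),
    avg_word_length     = mean(nchar(words)),
    avg_sentence_length = length(words) / length(sentences)
  )
}
```

Applied to the three-sentence text from the Examples section, this sketch counts 17 words and 3 sentences, with "this" as the single most common word. The list columns can be read with, for example, `res$least_common[[1]]`.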

Value

data.frame

Examples

library(RSyntext)

txt <- "This is the first sentence in this paragraph.
        This is the second sentence. This is the third."

# Named txt_summary to avoid masking base::summary()
txt_summary <- text_summarize(txt)

UBC-MDS/RSyntext documentation built on May 7, 2019, 7:14 p.m.