quotes: quotes dataset

Description

A collection of 61,071 unique quotes on a variety of topics, attributed to renowned personalities.

Usage

Format

A data.table object with 75,966 rows (a quote is repeated once for each topic it is classified under) and the following columns:

Source

Data extracted from the Famous Quotes Database, available at http://thewebminer.com/download.

Examples

# Unique quotes (repeated depending on the topic classification)
uniques <- unique(quotes$quote)
length(uniques)

# Wordcloud for a given topic

## Required packages
library(tm)
library(SnowballC)
library(wordcloud)
library(viridis)

## Preprocessing
top <- "time"
corpus <- Corpus(VectorSource(quotes[topic == top]$quote))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, stripWhitespace)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, c(stopwords("english"), top))
corpus <- tm_map(corpus, stemDocument)

## Wordcloud
## Not run: 
wordcloud(corpus, max.words = 100, colors = viridis(100))
title(paste("Wordcloud for topic \"", top, "\"", sep = ""))

## End(Not run)
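
The gap between the number of rows and the number of unique quotes comes from quotes being filed under more than one topic. The sketch below illustrates that bookkeeping on a small made-up table. The object `toy` and its contents are hypothetical stand-ins, and only the `quote` and `topic` columns used in the examples above are assumed to exist.

```r
# Toy stand-in for the `quotes` dataset (hypothetical rows, not real data).
# Each row is one (quote, topic) pair, so a quote filed under several
# topics appears several times.
toy <- data.frame(
  quote = c("q1", "q1", "q2", "q3", "q3", "q3"),
  topic = c("time", "life", "time", "love", "life", "time"),
  stringsAsFactors = FALSE
)

# Total rows versus distinct quote texts
n_rows <- nrow(toy)
n_unique <- length(unique(toy$quote))

# Number of topics each quote is classified under
topics_per_quote <- table(toy$quote)

# Number of quotes per topic, largest first
quotes_per_topic <- sort(table(toy$topic), decreasing = TRUE)

print(c(rows = n_rows, unique = n_unique))
print(quotes_per_topic)
```

The same base R calls should work with `quotes` in place of `toy`, since a data.table is also a data.frame.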

egarpor/quotes documentation built on May 16, 2019, 12:13 a.m.