emails: Email Subject Lines

Description

This dataset contains email subject lines for classifying whether a message is spam (unsolicited commercial content) or not. Many subject lines contain material inappropriate for classroom use. Because a large share of the subject lines contain such language (especially for type == "spam"), user discretion is advised.

Usage

emails

Format

A data frame with 6,908 rows and 3 variables:

subjectline (character): Email subject line

type (character): Email classification into three levels: spam, hard_ham, and easy_ham

ids (integer): Row number
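
To make the column layout concrete, here is a minimal base-R sketch of tabulating the three type levels and collapsing them into a binary spam label. It uses a toy data frame with the same three columns, not the dataset itself. The three rows are invented for illustration, and the names toy, type_counts, is_spam, and ham_only are hypothetical.

```r
# Toy stand-in with the same three columns as `emails`.
# These rows are invented for illustration; they are not from the dataset.
toy <- data.frame(
  subjectline = c("Re: meeting notes", "Win a free prize now", "Weekly newsletter"),
  type = c("easy_ham", "spam", "hard_ham"),
  ids = 1:3,
  stringsAsFactors = FALSE
)

# Count subject lines per class.
type_counts <- table(toy$type)

# Collapse the three levels into a binary spam / not-spam label.
toy$is_spam <- toy$type == "spam"

# Keep only the non-spam rows, for example to sidestep the content
# flagged in the Description.
ham_only <- toy[!toy$is_spam, ]
```

Assuming the package is installed and loaded, the same steps should apply to the real data by substituting emails for toy.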

Source

http://www.rdatasciencecases.org/Spam


leahannejohnson/textclassificationexamples documentation built on Feb. 7, 2022, 11:04 p.m.