The data consist of 4601 email items, of which 1813 were identified as spam. This is a subset of the full dataset, containing only six of the 57 explanatory variables in the complete dataset.
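As a quick check on the counts quoted above, the following minimal Python sketch works out the class balance. It uses only the two figures from the description; the variable names are illustrative and are not part of the dataset.

```python
# Counts taken from the dataset description.
total_emails = 4601
spam_emails = 1813

# Every item not identified as spam belongs to the other class.
non_spam_emails = total_emails - spam_emails

# Share of each class among all items.
spam_share = spam_emails / total_emails
non_spam_share = non_spam_emails / total_emails

print(f"not spam: {non_spam_emails} ({non_spam_share:.1%})")
print(f"spam:     {spam_emails} ({spam_share:.1%})")
```

Under these counts, 2788 items (about 60.6%) are not spam and about 39.4% are spam, so a rule that always predicts "not spam" would already be right on roughly three items in five.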
Columns included are:
total length of words in capitals
number of occurrences of the $ symbol
number of occurrences of the ! symbol
number of occurrences of the word ‘money’
number of occurrences of the string ‘000’
number of occurrences of the word ‘make’
outcome variable, a factor with levels n (not spam) and y (spam)
George Forman, Hewlett-Packard Laboratories
The complete dataset, and documentation, are available from Spam database