The spam dataset is available from, and fully described in, the UCI spambase directory; it has been used, for instance, in Hastie et al. (2001). It is a collection of 4601 spam and non-spam e-mails, each described by 57 continuous variables plus a nominal class label.
A data frame with 4601 observations on the following 58 variables:
[,1:57]  numeric  Descriptors of the e-mail contents, mostly word or character appearance percentages. See the UCI spambase directory for more information.
[,58]    factor   Labels: regular (0) or spam (1) e-mail.
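The column layout above can be sketched as follows. This is only an illustration: `spam_demo` is a hypothetical stand-in built from random numbers (not the real spambase values), and the names `x` and `y` are assumptions introduced here. With the actual data frame loaded, the same indexing should apply.

```r
# Hypothetical stand-in with the documented layout (57 numeric columns plus
# one factor label). The values are random, NOT the real spambase data.
set.seed(1)
n <- 20
spam_demo <- as.data.frame(matrix(runif(n * 57, 0, 100), nrow = n))
spam_demo[[58]] <- factor(sample(c(0, 1), n, replace = TRUE), levels = c(0, 1))

# Columns 1:57 hold the numeric descriptors, column 58 the class label.
x <- spam_demo[, 1:57]
y <- spam_demo[[58]]

# Check that the split matches the documented types.
stopifnot(all(sapply(x, is.numeric)), is.factor(y))

# Class counts: 0 = regular, 1 = spam.
table(y)
```

Separating descriptors from labels by column position keeps the sketch independent of the column names, which this page does not list.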
UCI spambase directory:
ftp://ftp.ics.uci.edu/pub/machine-learning-databases/spambase/
Hastie, T., Tibshirani, R. and Friedman, J. (2001) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Chapter 9. Springer, New York.