spambase | R Documentation |
This is a well known dataset with a binary target obtainable from the UCI machine learning dataset archive. Each row is an e-mail, which is considered to be either spam or not spam. The dataset contains 48 attributes that measure the percentage of times a particular word appears in the email, 6 attributes that measure the percentage of times a particular character appeared in the email, plus three attributes measuring run-lengths of capital letters.
data(spambase)
A data frame with 4,601 rows and 58 variables (1 categorical, 57 continuous).
is.spam
Is the email considered to be spam? (0=no,1=yes)
word.freq.make
Percentage of times the word 'make' appeared in the e-mail
word.freq.address
Percentage of times the word 'address' appeared in the e-mail
word.freq.all
Percentage of times the word 'all' appeared in the e-mail
word.freq.3d
Percentage of times the word '3d' appeared in the e-mail
word.freq.our
Percentage of times the word 'our' appeared in the e-mail
word.freq.over
Percentage of times the word 'over' appeared in the e-mail
word.freq.remove
Percentage of times the word 'remove' appeared in the e-mail
word.freq.internet
Percentage of times the word 'internet' appeared in the e-mail
word.freq.order
Percentage of times the word 'order' appeared in the e-mail
word.freq.mail
Percentage of times the word 'mail' appeared in the e-mail
word.freq.receive
Percentage of times the word 'receive' appeared in the e-mail
word.freq.will
Percentage of times the word 'will' appeared in the e-mail
word.freq.people
Percentage of times the word 'people' appeared in the e-mail
word.freq.report
Percentage of times the word 'report' appeared in the e-mail
word.freq.addresses
Percentage of times the word 'addresses' appeared in the e-mail
word.freq.free
Percentage of times the word 'free' appeared in the e-mail
word.freq.business
Percentage of times the word 'business' appeared in the e-mail
word.freq.email
Percentage of times the word 'email' appeared in the e-mail
word.freq.you
Percentage of times the word 'you' appeared in the e-mail
word.freq.credit
Percentage of times the word 'credit' appeared in the e-mail
word.freq.your
Percentage of times the word 'your' appeared in the e-mail
word.freq.font
Percentage of times the word 'font' appeared in the e-mail
word.freq.000
Percentage of times the word '000' appeared in the e-mail
word.freq.money
Percentage of times the word 'money' appeared in the e-mail
word.freq.hp
Percentage of times the word 'hp' appeared in the e-mail
word.freq.hpl
Percentage of times the word 'hpl' appeared in the e-mail
word.freq.george
Percentage of times the word 'george' appeared in the e-mail
word.freq.650
Percentage of times the word '650' appeared in the e-mail
word.freq.lab
Percentage of times the word 'lab' appeared in the e-mail
word.freq.labs
Percentage of times the word 'labs' appeared in the e-mail
word.freq.telnet
Percentage of times the word 'telnet' appeared in the e-mail
word.freq.857
Percentage of times the word '857' appeared in the e-mail
word.freq.data
Percentage of times the word 'data' appeared in the e-mail
word.freq.415
Percentage of times the word '415' appeared in the e-mail
word.freq.85
Percentage of times the word '85' appeared in the e-mail
word.freq.technology
Percentage of times the word 'technology' appeared in the e-mail
word.freq.1999
Percentage of times the word '1999' appeared in the e-mail
word.freq.parts
Percentage of times the word 'parts' appeared in the e-mail
word.freq.pm
Percentage of times the word 'pm' appeared in the e-mail
word.freq.direct
Percentage of times the word 'direct' appeared in the e-mail
word.freq.cs
Percentage of times the word 'cs' appeared in the e-mail
word.freq.meeting
Percentage of times the word 'meeting' appeared in the e-mail
word.freq.original
Percentage of times the word 'original' appeared in the e-mail
word.freq.project
Percentage of times the word 'project' appeared in the e-mail
word.freq.re
Percentage of times the word 're' appeared in the e-mail
word.freq.edu
Percentage of times the word 'edu' appeared in the e-mail
word.freq.table
Percentage of times the word 'table' appeared in the e-mail
word.freq.conference
Percentage of times the word 'conference' appeared in the e-mail
char.freq.;
Percentage of times the character ';' appeared in the e-mail
char.freq.(
Percentage of times the character '(' appeared in the e-mail
char.freq.[
Percentage of times the character '[' appeared in the e-mail
char.freq.!
Percentage of times the character '!' appeared in the e-mail
char.freq.$
Percentage of times the character '$' appeared in the e-mail
char.freq.#
Percentage of times the character '#' appeared in the e-mail
capital.run.length.average
Average length of contiguous runs of capital letters in the e-mail
capital.run.length.longest
Maximum length of contiguous runs of capital letters in the e-mail
capital.run.length.total
Total number of capital letters in the e-mail
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.