Spamchecker, or 'Spam or Ham', is a package that predicts whether a given text (an email) is spam or not. The data behind the model comes from https://archive.ics.uci.edu/ml/datasets/Spambase, and the prediction is made with an SVM (support vector machine) model. The drawback of this dataset is that it is based on university emails from 1999, so it might not give accurate predictions for general spam emails today. Nevertheless, the concept of spam checking remains similar.

This package contains the following functions: predictEmail and textToAttribute. Each function is described below.

Load the package by running the following command. Note that loading takes a while, because the package builds the SVM prediction model at load time:

library("spamchecker")

Functions

predictEmail

Predict whether a given text is spam or not. The function first calculates the attribute values of the text (using the textToAttribute function), then passes them to the SVM prediction model that was created when the library was loaded.

myemail <- "Hi, this is my email. PLEASE DO NOT REPLY!"
predictEmail(myemail)
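
The two-step flow described above (attributes first, then the SVM) can be sketched roughly as follows. This is only an illustration and not the package's actual code: it assumes the e1071 package as the SVM backend, and the two attribute columns and the six training rows are made-up toy values rather than Spambase data.

```r
library(e1071)  # assumption: the package may use a different SVM library

# Toy training set with two hypothetical Spambase-style attributes.
train <- data.frame(
  char_freq_excl           = c(0.0, 0.1, 0.2, 2.5, 3.0, 4.1),
  capital_run_length_total = c(2, 5, 3, 40, 55, 38),
  label = factor(c("ham", "ham", "ham", "spam", "spam", "spam"))
)

# Step 1 (done once, e.g. at load time): fit the SVM classifier.
model <- svm(label ~ ., data = train, kernel = "linear")

# Step 2 (per email): compute the same attributes for the new text,
# then ask the model for a class label.
new_attrs <- data.frame(char_freq_excl = 2.4, capital_run_length_total = 17)
prediction <- predict(model, newdata = new_attrs)
print(prediction)
```

The important point is that the new text must be described by the same attribute columns as the training data, which is the job textToAttribute does in this package.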

textToAttribute

Calculate the attributes of the text that are used by the prediction model. This function can also be used to inspect which attributes might affect the prediction.

myemail <- "Hi, this is my email. PLEASE DO NOT REPLY!"
textToAttribute(myemail)
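
To give a feel for what such attributes look like, here is a minimal base R sketch that computes a few Spambase-style values: word frequency as a percentage of all words, character frequency as a percentage of all characters, and the lengths of runs of capital letters. The helper name spambase_attributes, its arguments, and the choice to lowercase words before matching are assumptions made for illustration; the real textToAttribute may compute more attributes and may do so differently.

```r
# Hypothetical helper, not the package's textToAttribute().
spambase_attributes <- function(text, words = c("free", "money"),
                                chars = c("!", "$")) {
  # A "word" is a run of alphanumeric characters (lowercased here by assumption).
  tokens <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  n_words <- length(tokens)
  n_chars <- nchar(text)

  # word_freq_W = 100 * (occurrences of W) / (total number of words)
  word_freq <- vapply(words, function(w) {
    if (n_words == 0) 0 else 100 * sum(tokens == w) / n_words
  }, numeric(1))
  names(word_freq) <- paste0("word_freq_", words)

  # char_freq_C = 100 * (occurrences of C) / (total number of characters)
  text_chars <- strsplit(text, "", fixed = TRUE)[[1]]
  char_freq <- vapply(chars, function(ch) {
    if (n_chars == 0) 0 else 100 * sum(text_chars == ch) / n_chars
  }, numeric(1))
  names(char_freq) <- paste0("char_freq_", chars)

  # Lengths of uninterrupted runs of capital letters.
  runs <- regmatches(text, gregexpr("[[:upper:]]+", text))[[1]]
  run_len <- nchar(runs)
  capital <- c(
    capital_run_length_average = if (length(run_len)) mean(run_len) else 0,
    capital_run_length_longest = if (length(run_len)) max(run_len) else 0,
    capital_run_length_total   = sum(run_len)
  )

  c(word_freq, char_freq, capital)
}

myemail <- "Hi, this is my email. PLEASE DO NOT REPLY!"
attrs <- spambase_attributes(myemail, words = c("email", "free"),
                             chars = c("!", "$"))
print(attrs)
```

Comparing these values for a few different texts shows why messages with many exclamation marks or long capitalised runs tend to look more spam-like to a model trained on this kind of data.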


megahf/spamfilter documentation built on May 29, 2019, 4:42 a.m.