This package for R implements the MARC algorithm aimed at postprocesing output of a CBA classifier.
Liu, B. Hsu, W. and Ma, Y (1998). Integrating Classification and Association Rule Mining. Proceedings KDD-98, New York, 27-31 August. AAAI. pp 80-86.
The arc package is used for generation of the CBA classifier.
Package can be installed from the R environment using the devtools package.
devtools::install_github("kliegr/marc")
library(rMARC)
library(mlbench)
data("PimaIndiansDiabetes")
set.seed(111)
allData <- PimaIndiansDiabetes[sample(nrow(PimaIndiansDiabetes)),]
trainFold <- allData[1:500,]
testFold <- allData[501:nrow(PimaIndiansDiabetes),]
rmCBA <- cba(trainFold, classAtt="diabetes")
prediction <- predict(rmCBA,testFold)
acc <- CBARuleModelAccuracy(prediction, testFold[[rmCBA@classAtt]])
print(paste("CBA Model with ",length(rmCBA@rules), " rules and accuracy ",acc))
rmMARC <- marcExtend(cbaRuleModel=rmCBA,datadf=trainFold,continuousPruning=FALSE, postpruning=TRUE, fuzzification=FALSE, annotate=FALSE,ruleOutputPath="rules.xml")
prediction <- predict(rmMARC,testFold,"oneRule")
acc <- CBARuleModelAccuracy(prediction, testFold[[rmMARC@classAtt]])
print(paste("MARC Model with ",rmMARC@ruleCount, " rules and accuracy ",acc))
print(rmMARC@rules)
Output
[1] CBA Model with 53 rules and accuracy 0.753731343283582
[1] MARC Model with 47 rules and accuracy 0.753731343283582
MARC decreased the number of rules while keeping same accuracy.
If we actived the continuousPruning option, it would result in aggressive pruning:
[1] MARC Model with 28 rules and accuracy 0.67910447761194
library(rMARC)
library(mlbench)
data("Ionosphere")
library(rMARC)
set.seed(111)
allData <- Ionosphere[sample(nrow(Ionosphere)),]
trainFold <- allData[1:300,]
testFold <- allData[301:nrow(Ionosphere),]
rmCBA <- cba(trainFold, classAtt="Class")
prediction <- predict(rmCBA,testFold)
acc <- CBARuleModelAccuracy(prediction, testFold[[rmCBA@classAtt]])
print(paste("CBA Model with ",length(rmCBA@rules), " rules and accuracy ",acc))
rmMARC <- marcExtend(cbaRuleModel=rmCBA,datadf=trainFold,continuousPruning=TRUE, postpruning=TRUE, fuzzification=FALSE, annotate=TRUE,ruleOutputPath="rules.xml")
prediction <- predict(rmMARC,testFold,"mixture")
acc <- CBARuleModelAccuracy(prediction, testFold[[rmMARC@classAtt]])
print(paste("MARC Model with ",rmMARC@ruleCount, " rules and accuracy ",acc))
print(rmMARC@rulePath)
Output
[1] CBA Model with 14 rules and accuracy 0.941176470588235
[1] MARC Model with 14 rules and accuracy 0.96078431372549
MARC improved accuracy while keeping the number of rules. Rules in MARC multi rule model cannot be currently visualized - they are stored in a file.
A lightweight benchmarking framework for MARC is available as marcbench.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.