IPPModel: Class providing objects with methods for drawing IPPs and FINs
In XZPackage/IPPModel: Impact pattern plots and feature interaction networks


The class provides objects with methods for drawing impact pattern plots (IPPs) and feature interaction networks (FINs).

IPPModel

R6Class object.

An R6Class object with methods for drawing IPPs and FINs.

X.Data: a data.frame, the dataset of input features.
Pred.Fun: an object, the fitted model used for prediction. It can be a model created by packages such as "nnet", "randomForest" and "kernlab".
Model.Package: a string, the name of the package that created the interpreted machine learning model, such as "nnet" or "randomForest".
Pred.Type: a string, the type of prediction, such as "raw", "response" or "prob" (see Examples).
Pred.Dimension: an integer, indicating which class is predicted. This field is used only for classification models.
XB.Size: an integer, the size of XB.Sample.
XB.SamplingMethod: a string, the sampling method of XB.Sample, "joint" or "independent". "joint" means that all features are sampled from X.Data jointly. "independent" means that each feature is sampled independently; then all features are combined randomly.
ParaTable: a data.frame, the parameter table. It is generated by method GenerateParaTable.
XA.Sample: a list, the sample of X_A extracted from X.Data. It is generated by method SamplingXA.
XB.Sample: a list, the sample of X_B extracted from X.Data. It is generated by method SamplingXB.
Pred.Res: a list, the prediction results of f(X_A,X_B), which is generated by method PredictData.
Clustering.Res: a list, the clustering results, which is generated by method ClusterImpactPlots.
TreeRules: a list, the decision tree rules, which is generated by method BuildTree.
FIN.Data: a data.frame, the feature interaction network, which is generated by method BuildTree.
ColorList: a list, the curve colors used for drawing IPPs.
TaskFinishTime: a list, the finishing time of tasks.
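
The difference between the two XB.SamplingMethod options can be illustrated with a small base-R sketch. It does not reproduce the package's internal code: the variable names (X, n, XB.joint, XB.indep) are hypothetical, and sampling with replacement is an assumption made only for this illustration.

```r
# Illustration only: base-R sketch of "joint" vs. "independent" sampling.
# Names and the use of replace = TRUE are assumptions, not package internals.
set.seed(1)
X <- iris[-5]   # input features only
n <- 200        # plays the role of XB.Size

# "joint": draw whole rows, so every sampled point is an observed combination
joint.idx <- sample(nrow(X), n, replace = TRUE)
XB.joint <- X[joint.idx, , drop = FALSE]

# "independent": draw each column separately, then bind the columns together,
# which breaks the dependence between features
XB.indep <- as.data.frame(lapply(X, function(col) sample(col, n, replace = TRUE)))

# the correlation between features is retained by joint sampling
# and largely destroyed by independent sampling
cor.joint <- cor(XB.joint$Petal.Length, XB.joint$Petal.Width)
cor.indep <- cor(XB.indep$Petal.Length, XB.indep$Petal.Width)
```

In this sketch, joint sampling keeps only feature combinations that occur in the data, while independent sampling can produce combinations that were never observed together.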

initialize: initialize some fields of the object and execute the methods CheckInitialization and GenerateParaTable.
CheckInitialization: validate the initialization information.
GenerateParaTable: generate the parameter table ParaTable.
CheckParaTable: validate the information in ParaTable.
SamplingXA: sample XA.Sample from X.Data.
SamplingXB: sample XB.Sample from X.Data.
PredictData: predict data using Pred.Fun based on XA.Sample and XB.Sample.
ClusterImpactPlots: cluster the impact curves of each feature based on the prediction results Pred.Res.
BuildTree: build a decision tree based on the clustering results Clustering.Res.
DrawIPP: draw the impact pattern plots.
DrawFIN: draw the feature interaction network.
WriteToExcel: write the results to an Excel file.
ExecuteAll: execute the methods SamplingXA, SamplingXB, PredictData, ClusterImpactPlots and BuildTree in sequence.

Xiaohang Zhang, Ji Zhu, SuBang Choe, Yi Lu and Jing Liu. Exploring black box of supervised learning models: Visualizing the impact of features on prediction. Working paper.

library(IPPModel)
library(igraph)

#------ FIRST EXAMPLE ------
library(nnet)
data("bank")
# build model
bank.NN <- nnet(y ~ ., data = bank, size = 5, maxit = 1000)
# remove the output variable
bank.ds = bank[-17]
# create IPPModel object
IPP.bank = IPPModel$new(XDS=bank.ds, PredFun=bank.NN,
                        ModelPackage="nnet", PredType="raw", PredDim=1,
                        XB.Size=1000, XB.SamplingMethod="joint")
# modify the clustering method to "kmedoids"
IPP.bank$ParaTable$clusteringMethod = "kmedoids"
# execute all tasks
IPP.bank$ExecuteAll()
# draw impact pattern plots (IPP)
IPP.bank$DrawIPP(centralized = TRUE, nc = 4)
# draw feature interaction network (FIN)
IPP.bank$DrawFIN(threshold = 0.2, lay.out = igraph::layout.auto)
# write the results to an Excel file
IPP.bank$WriteToExcel("output.xlsx")

#------ SECOND EXAMPLE ------
library(randomForest)
data("whitewine")
# build model
WW.RF <- randomForest(quality ~ ., data = whitewine, mtry = 4,
                      importance = TRUE, na.action = na.omit)
# remove the output variable
WW.ds = whitewine[-12]
# create IPPModel object
IPP.WW = IPPModel$new(XDS=WW.ds, PredFun=WW.RF,
                      ModelPackage="randomForest", PredType="response", PredDim=1,
                      XB.Size=1000, XB.SamplingMethod="joint")
# set the maximum depth of trees to be 5
IPP.WW$ParaTable$treeDepth = 5
# execute all tasks
IPP.WW$ExecuteAll()
# draw impact pattern plots (IPP)
IPP.WW$DrawIPP(centralized = TRUE, nc = 4)
# draw feature interaction network (FIN)
IPP.WW$DrawFIN(threshold = 0.1, lay.out = igraph::layout.circle)

#------ THIRD EXAMPLE ------
library(kernlab)
data("iris")
# build model
iris.SVM <- ksvm(Species ~ ., data = iris, kernel = "rbfdot", kpar = "automatic",
                 C = 0.1, prob.model = TRUE)
# remove the output variable
iris.ds = iris[-5]
# create IPPModel object
IPP.iris = IPPModel$new(XDS=iris.ds, PredFun=iris.SVM,
                        ModelPackage="kernlab", PredType="prob", PredDim=1,
                        XB.Size=200, XB.SamplingMethod="independent")
# execute the tasks step by step
IPP.iris$SamplingXA()          # sample XA
IPP.iris$SamplingXB()          # sample XB
IPP.iris$PredictData()         # predict
IPP.iris$ClusterImpactPlots()  # cluster the impact curves
IPP.iris$BuildTree()           # build tree
# draw impact pattern plots (IPP)
IPP.iris$DrawIPP(centralized = TRUE, nc = 4)
# draw feature interaction network (FIN)
IPP.iris$DrawFIN(threshold = 0.3, lay.out = igraph::layout.auto)
# write the results to an Excel file
IPP.iris$WriteToExcel("output.xlsx")