IPPModel: Class providing objects with methods for drawing IPPs and FINs

Description

The class provides objects with methods for drawing impact pattern plots (IPPs) and feature interaction networks (FINs).

Usage

Format

R6Class object.

Value

An R6Class object with methods for drawing IPPs and FINs.

Fields

X.Data

a data.frame, the dataset of input features.

Pred.Fun

an object, the prediction function. It can be any model created by packages such as "nnet", "randomForest" and "kernlab".

Model.Package

a string, the name of the package that created the machine learning model being interpreted, such as "nnet" or "randomForest".

Pred.Type

a string, the type of prediction, such as "raw", "response" or "prob" (the values used in the Examples section).

Pred.Dimension

an integer, indicating which class is predicted. This field is used only for classification models.

XB.Size

an integer, the size of XB.Sample.

XB.SamplingMethod

a string, the sampling method of XB.Sample, "joint" or "independent". "joint" means that all features are sampled from X.Data jointly. "independent" means that each feature is sampled independently; then all features are combined randomly.

ParaTable

a data.frame, the parameter table. It is generated by method GenerateParaTable.

XA.Sample

a list, the sample of X_A extracted from X.Data. It is generated by method SamplingXA.

XB.Sample

a list, the sample of X_B extracted from X.Data. It is generated by method SamplingXB.

Pred.Res

a list, the prediction results of f(X_A,X_B), which is generated by method PredictData.

Clustering.Res

a list, the clustering results, which is generated by method ClusterImpactPlots.

TreeRules

a list, the decision tree rules, which is generated by method BuildTree.

FIN.Data

a data.frame, the feature interaction network, which is generated by method BuildTree.

ColorList

a list, the curve colors used for drawing IPPs.

TaskFinishTime

a list, the finishing time of tasks.
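
The difference between the two XB.SamplingMethod options can be illustrated with a minimal base-R sketch. The helper sample_xb() is hypothetical and is not part of the IPPModel package; sampling with replacement is an assumption, since the field description does not say how rows are drawn.

```r
# Illustration only: hypothetical helper, not a function of the IPPModel package.
sample_xb <- function(x_data, size, method = c("joint", "independent")) {
  method <- match.arg(method)
  if (method == "joint") {
    # Draw whole rows, so observed feature combinations are preserved.
    rows <- sample(nrow(x_data), size, replace = TRUE)
    out <- x_data[rows, , drop = FALSE]
  } else {
    # Draw each column separately, which breaks dependence between features.
    out <- as.data.frame(
      lapply(x_data, function(col) col[sample(length(col), size, replace = TRUE)]),
      stringsAsFactors = FALSE
    )
  }
  rownames(out) <- NULL
  out
}

set.seed(1)
x <- iris[-5]                                  # input features only
xb_joint <- sample_xb(x, 200, "joint")
xb_indep <- sample_xb(x, 200, "independent")
dim(xb_joint)
dim(xb_indep)
```

With "joint", every sampled row is a combination of feature values that actually occurs in the data; with "independent", the sample can contain combinations that never occur together.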

Methods

initialize

initialize some fields of the object and execute the methods CheckInitialization and GenerateParaTable.

CheckInitialization

validate the initialization information.

GenerateParaTable

generate the parameter table ParaTable.

CheckParaTable

validate the information in ParaTable.

SamplingXA

sample XA.Sample from X.Data.

SamplingXB

sample XB.Sample from X.Data.

PredictData

predict data using Pred.Fun based on XA.Sample and XB.Sample.

ClusterImpactPlots

cluster the impact curves of each feature based on the prediction results Pred.Res.

BuildTree

build decision tree based on the clustering results Clustering.Res.

DrawIPP

draw the impact pattern plots.

DrawFIN

draw the feature interaction network.

WriteToExcel

write the results to an Excel file.

ExecuteAll

execute the methods SamplingXA, SamplingXB, PredictData, ClusterImpactPlots and BuildTree in sequence.
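
The idea behind the sampling, prediction and clustering steps can be sketched in base R. This is a hypothetical illustration under stated assumptions, not the package's implementation: the toy model, the grid size, the centring step and the use of kmeans (as a stand-in for whatever clustering method ParaTable selects) are all choices made for this example.

```r
# Illustration only: hypothetical base-R sketch, not the IPPModel implementation.
set.seed(42)
fit <- lm(mpg ~ wt * hp, data = mtcars)   # toy model with an interaction

# X_A: a grid over the feature of interest; X_B: sampled values of the rest.
xa_grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)
xb_rows <- mtcars[sample(nrow(mtcars), 30, replace = TRUE), "hp", drop = FALSE]

# One impact curve per X_B row: f(x_A, x_B) evaluated along the grid.
curves <- t(sapply(seq_len(nrow(xb_rows)), function(i) {
  newdata <- data.frame(wt = xa_grid, hp = xb_rows$hp[i])
  predict(fit, newdata = newdata)
}))

# Centre each curve so clustering compares shapes rather than levels.
centred <- curves - rowMeans(curves)

# Group curves by shape; kmeans is a stand-in for the package's clustering.
clusters <- kmeans(centred, centers = 2, nstart = 10)
table(clusters$cluster)
```

Each row of curves is one curve of predictions along the wt grid for a fixed hp value. Because the toy model contains a wt:hp interaction, the curves differ in slope, which is what gives the clustering step something to separate.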

References

Xiaohang Zhang, Ji Zhu, SuBang Choe, Yi Lu and Jing Liu. Exploring black box of supervised learning models: Visualizing the impact of features on prediction. Working paper.

Examples

library(IPPModel)
library(igraph)

#------ FIRST EXAMPLE ------
library(nnet)
data("bank")
# build model
bank.NN <- nnet(y ~ ., data = bank, size = 5, maxit = 1000)
# remove the output variable
bank.ds = bank[-17]
# create IPPModel object
IPP.bank = IPPModel$new(XDS=bank.ds, PredFun=bank.NN,
                        ModelPackage="nnet", PredType="raw", PredDim=1,
                        XB.Size=1000, XB.SamplingMethod="joint")
# modify the clustering method to "kmedoids"
IPP.bank$ParaTable$clusteringMethod = "kmedoids"
# execute all tasks
IPP.bank$ExecuteAll()
# draw impact pattern plots (IPP)
IPP.bank$DrawIPP(centralized = TRUE, nc = 4)
# draw feature interaction network (FIN)
IPP.bank$DrawFIN(threshold = 0.2, lay.out = igraph::layout.auto)
# write the results into an Excel file
IPP.bank$WriteToExcel("output.xlsx")

#------ SECOND EXAMPLE ------
library(randomForest)
data("whitewine")
# build model
WW.RF <- randomForest(quality ~ ., data = whitewine, mtry = 4,
                      importance = TRUE, na.action = na.omit)
# remove the output variable
WW.ds = whitewine[-12]
# create IPPModel object
IPP.WW = IPPModel$new(XDS=WW.ds, PredFun=WW.RF,
                      ModelPackage="randomForest", PredType="response", PredDim=1,
                      XB.Size=1000, XB.SamplingMethod="joint")
# set the maximum depth of trees to be 5
IPP.WW$ParaTable$treeDepth = 5
# execute all tasks
IPP.WW$ExecuteAll()
# draw impact pattern plots (IPP)
IPP.WW$DrawIPP(centralized = TRUE, nc = 4)
# draw feature interaction network (FIN)
IPP.WW$DrawFIN(threshold = 0.1, lay.out = igraph::layout.circle)

#------ THIRD EXAMPLE ------
library(kernlab)
data("iris")
# build model
iris.SVM <- ksvm(Species ~ ., data = iris, kernel = "rbfdot",
                 kpar = "automatic", C = 0.1, prob.model = TRUE)
# remove the output variable
iris.ds = iris[-5]
# create IPPModel object
IPP.iris = IPPModel$new(XDS=iris.ds, PredFun=iris.SVM,
                        ModelPackage="kernlab", PredType="prob", PredDim=1,
                        XB.Size=200, XB.SamplingMethod="independent")
# execute the tasks step by step
IPP.iris$SamplingXA()  # sampling XA
IPP.iris$SamplingXB()  # sampling XB
IPP.iris$PredictData()  # predict
IPP.iris$ClusterImpactPlots() # clustering impact plots
IPP.iris$BuildTree()  # build tree
# draw impact pattern plots (IPP)
IPP.iris$DrawIPP(centralized = TRUE, nc = 4)
# draw feature interaction network (FIN)
IPP.iris$DrawFIN(threshold = 0.3, lay.out = igraph::layout.auto)
# write the results into an Excel file
IPP.iris$WriteToExcel("output.xlsx")

XZPackage/IPPModel documentation built on May 17, 2019, 6:36 p.m.