CSMES.ensSel (R Documentation)
Description:

This function applies the first stage of the CSMES learning process: Cost-Sensitive Multicriteria Ensemble Selection, which yields a Pareto frontier of equivalent candidate ensemble classifiers along two objective functions. By default, the cost space is explored by minimizing the false negative rate (FNR) and false positive rate (FPR) simultaneously. This results in a set of optimal ensemble classifiers that vary in their tradeoff between FNR and FPR. Optionally, other objective metrics can be specified. Currently, only binary classification is supported.
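The idea of a Pareto frontier over two error rates can be illustrated with a small, package-independent sketch. The `pareto_front` function below is purely illustrative and not part of CSMES; it identifies the non-dominated candidates among hypothetical (FNR, FPR) pairs:

```r
# Illustrative only: find Pareto-optimal (non-dominated) candidates
# given two objectives to minimize, e.g. FNR and FPR per candidate ensemble.
pareto_front <- function(obj1, obj2) {
  n <- length(obj1)
  keep <- logical(n)
  for (i in 1:n) {
    # candidate i is dominated if some candidate j is no worse on both
    # objectives and strictly better on at least one of them
    dominated <- any(obj1 <= obj1[i] & obj2 <= obj2[i] &
                     (obj1 < obj1[i] | obj2 < obj2[i]))
    keep[i] <- !dominated
  }
  which(keep)
}

# hypothetical objective values for five candidate ensembles
fnr <- c(0.10, 0.20, 0.15, 0.30, 0.05)
fpr <- c(0.40, 0.10, 0.45, 0.25, 0.50)
pareto_front(fnr, fpr)  # candidates 1, 2 and 5 form the frontier
```

Candidates 3 and 4 are dropped because candidates 1 and 2, respectively, are at least as good on both objectives; the survivors differ only in their FNR/FPR tradeoff, which is exactly the set CSMES.ensSel hands to the next stage.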
Usage:

CSMES.ensSel(
  memberPreds,
  y,
  obj1 = c("FNR", "AUCC", "MSE", "AUC"),
  obj2 = c("FPR", "ensSize", "ensSizeSq", "clAmb"),
  selType = c("selection", "selectionWeighted", "weighted"),
  plotting = TRUE,
  generations = 30,
  popsize = 100
)
Arguments:

memberPreds: matrix containing the ensemble member library predictions.

y: vector with the true class labels. Currently, only a dichotomous outcome variable is supported.

obj1: specifies the first objective metric to be minimized.

obj2: specifies the second objective metric to be minimized.

selType: specifies the type of ensemble selection to be applied: "selection", "selectionWeighted" or "weighted".

plotting: logical; should a plot of the Pareto frontier be generated? Default is TRUE.

generations: the number of population generations for NSGA-II. Default is 30.

popsize: the population size for NSGA-II. Default is 100.
Value:

An object of class CSMES.ensSel, which is a list with the following components:
weights: ensemble member weights for all Pareto-optimal ensemble classifiers after multicriteria ensemble selection.

obj_values: optimization objective values.

pareto: overview of the Pareto-optimal ensemble classifiers.

popsize: the population size for NSGA-II.

generations: the number of population generations for NSGA-II.

obj1: the first objective metric that was minimized.

obj2: the second objective metric that was minimized.

selType: the type of ensemble selection that was applied.

ParetoPredictions_p: probability predictions for the Pareto-optimal ensemble classifiers.

ParetoPredictions_c: class predictions for the Pareto-optimal ensemble classifiers.
Author(s):

Koen W. De Bock, kdebock@audencia.com
References:

De Bock, K.W., Lessmann, S. and Coussement, K., Cost-sensitive business failure prediction when misclassification costs are uncertain: A heterogeneous ensemble selection approach, European Journal of Operational Research (2020), doi: 10.1016/j.ejor.2020.01.052.
Examples:

##load data
library(rpart)
library(zoo)
library(ROCR)
library(mco)
data(BFP)
##shuffle the data and create train/validation/test partitions
BFP_r <- BFP[sample(nrow(BFP), nrow(BFP)), ]
size <- nrow(BFP_r)
##size <- 300
train <- BFP_r[1:floor(size/3), ]
val <- BFP_r[ceiling(size/3):floor(2*size/3), ]
test <- BFP_r[ceiling(2*size/3):size, ]
##train 100 CART decision trees varying in the cp and minsplit parameters,
##each on a bootstrap sample of the training set (bagging)
rpartSpecs <- list()
for (i in 1:100) {
  data <- train[sample(1:nrow(train), size = nrow(train), replace = TRUE), ]
  rpartSpecs[[paste0("rpart", i)]] <- rpart(Class ~ ., data, method = "class",
    control = rpart.control(minsplit = round(runif(1, min = 1, max = 20)),
                            cp = runif(1, min = 0.05, max = 0.4)))
}
##generate member library predictions on the validation set
hillclimb <- mat.or.vec(nrow(val), 100)
for (i in 1:100) {
  hillclimb[, i] <- predict(rpartSpecs[[i]], newdata = val)[, 2]
}
##apply cost-sensitive multicriteria ensemble selection on the validation set
ESmodel <- CSMES.ensSel(hillclimb, val$Class, obj1 = "FNR", obj2 = "FPR",
  selType = "selection", generations = 10, popsize = 12, plotting = TRUE)
##create an ensemble nomination curve
enc <- CSMES.ensNomCurve(ESmodel, hillclimb, val$Class, curveType = "costCurve",
  method = "classPreds", plot = FALSE)