Cocktail Ensemble: build a model consisting of multiple classifiers.

Share:

Description

cocktailEnsemble is a classification algorithm. It builds four models by calling glm (logit), kernelFactory, randomForest, and ada.

Usage

1

Arguments

x

A data frame containing the predictors.

y

The response vector.

Value

An object of type cocktailEnsemble containing the four aforementioned models.

Author(s)

Dirk Van den Poel, Michel Ballings, Andrey Volkov, Jeroen D”haen, Michiel Van Herwegen

Maintainer: Michel Ballings <Michel.Ballings@GMail.com>

References

Van den Poel, D., Ballings, M., Volkov, A., D”haen, J., Vanherwegen, M., Predictive Analytics for analytical Customer Relationship Management using SAS, Oracle and R, Springer, Forthcoming.

glm:

  • Dobson, A. J. (1990) An Introduction to Generalized Linear Models. London: Chapman and Hall.

  • Hastie, T. J., & Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

  • McCullagh P., & Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.

  • Venables, W. N., & Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.

randomForest:

  • Liaw, A. & Wiener, M. (2002). Classification and Regression by randomForest. R News 2(3), 18–22.

  • Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.

kernelFactory:

  • Ballings, M., & Van den Poel, D. (2012). Kernel Factory: An ensemble of Kernel Machines. Expert Systems With Applications. Forthcoming.

  • Ballings, M., & Van den Poel, D. (2012). kernelFactory: An ensemble of kernel machines. R package version 0.1.1 http://cran.r-project.org/web/packages/kernelFactory.

ada:

  • Culp, M., Johnson, K., & Michailidis, G. (2012). ada: ada: an R package for stochastic boosting. R package version 2.0-3. http://CRAN.R-project.org/package=ada

  • Friedman, J. (1999). Greedy Function Approximation: A Gradient Boosting Machine. Technical Report, Department of Statistics, Standford University.

  • Friedman, J., Hastie, T., and Tibshirani, R. (2000). Additive Logistic Regression: A statistical view of boosting. Annals of Statistics, 28(2), 337-374.

  • Friedman, J. (2002). Stochastic Gradient Boosting. Coputational Statistics \& Data Analysis 38.

  • Culp, M., Johnson, K., & Michailidis, G. (2006). ada: an R Package for Stochastic Boosting Journal of Statistical Software, 16.

See Also

Other functions in this package: imputeMissings, Aggregate, cocktailEnsemble, predict.cocktailEnsemble

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
#Credit Approval data available at UCI Machine Learning Repository
data(Credit)

#Create training set (take a small subset for demonstration purposes)
Credit <- data.frame(Credit[order(runif(nrow(Credit ))),])[1:100,c('V2','V3','V8','V11','V14','V15','Response')]
trainingset <- Credit[1:1:floor(0.50*nrow(Credit)),]
#Create test set
#testset <- Credit[(floor(0.50*nrow(Credit))+1 ):nrow(Credit),]


#Train Cocktail Ensemble on training data
cE <- cocktailEnsemble(x=trainingset[,names(trainingset)!= "Response"],y=trainingset$Response)

#Deploy Kernel Factory to predict response for test data
#pred <- predict(cE,testset[,names(testset)!= "Response"])