Fits the GAMbag, GAMrsm or GAMens ensemble algorithms for binary classification using generalized additive models as base classifiers.
formula 
a formula, as in the 
data 
a data frame in which to interpret the variables named in formula.
rsm_size 
an integer, the number of variables to use for random
feature subsets used in the Random Subspace Method. Default is 2. If

autoform 
if 
iter 
an integer, the number of base classifiers (GAMs) in the
ensemble. Defaults to 
df 
an integer, the number of degrees of freedom (df) used for
smoothing spline estimation. Its value is only used when 
bagging 
enables Bagging if value is TRUE.
rsm 
enables Random Subspace Method (RSM) if value is TRUE.
fusion 
specifies the fusion rule for the aggregation of member
classifier outputs in the ensemble. Possible values are 'avgagg' (default), 'majvote', 'w.avgagg' or 'w.majvote'.
The GAMens function applies the GAMbag, GAMrsm or GAMens ensemble classifiers (De Bock et al., 2010) to a data set. GAMens is the default (bagging=TRUE and rsm=TRUE). For GAMbag, rsm should be specified as FALSE. For GAMrsm, bagging should be FALSE.
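As a rough illustration of how the two switches might shape the training data of a single ensemble member, the base R sketch below draws a bootstrap sample of rows when bagging is on and a random subset of predictors when RSM is on. The helper `draw_member_sample` and the toy data are hypothetical and are not part of the GAMens package, whose internal sampling may differ.

```r
# Hypothetical helper, NOT part of the GAMens package: sketches the row and
# variable sampling that bagging and the Random Subspace Method imply.
draw_member_sample <- function(data, response, rsm_size = 2,
                               bagging = TRUE, rsm = TRUE) {
  predictors <- setdiff(names(data), response)
  n <- nrow(data)
  # Bagging: bootstrap sample of rows (with replacement); otherwise all rows.
  rows <- if (bagging) sample.int(n, n, replace = TRUE) else seq_len(n)
  # RSM: random subset of predictors (without replacement); otherwise all.
  vars <- if (rsm) {
    sample(predictors, min(rsm_size, length(predictors)))
  } else {
    predictors
  }
  list(rows = rows, vars = vars)
}

set.seed(1)
toy <- data.frame(y  = factor(rep(c("a", "b"), 5)),
                  x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

# GAMens-style member: both bagging and RSM enabled.
s <- draw_member_sample(toy, "y", rsm_size = 2)
member_data <- toy[s$rows, c(s$vars, "y")]
```

Setting `rsm = FALSE` in this sketch corresponds to the GAMbag configuration (all predictors, bootstrapped rows), and `bagging = FALSE` to GAMrsm (all rows, random predictor subset).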
The GAMens function provides the possibility for automatic formula specification. In this case, dichotomous variables in data are included as linear terms, and other variables are assumed continuous, included as nonparametric terms, and estimated by means of smoothing splines. To enable automatic formula specification, use the generic formula [response variable name]~. in combination with autoform=TRUE. Note that in this case, all variables available in data are used in the model. If a formula other than [response variable name]~. is specified, then the autoform option is automatically overridden. If autoform=FALSE and the generic formula [response variable name]~. is specified, then the GAMs in the ensemble will not contain nonparametric terms (i.e., will only consist of linear terms).
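The rule described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper `build_auto_formula` is hypothetical, and it treats a variable as dichotomous when it has exactly two distinct non-missing values, which may not match the check used inside GAMens.

```r
# Hypothetical helper, NOT the GAMens implementation: builds a formula in
# which dichotomous predictors enter linearly and all others as s(var, df).
build_auto_formula <- function(data, response, df = 4) {
  predictors <- setdiff(names(data), response)
  terms <- vapply(predictors, function(v) {
    # Assumption: "dichotomous" means exactly two distinct non-missing values.
    n_values <- length(unique(stats::na.omit(data[[v]])))
    if (n_values == 2) v else sprintf("s(%s, %d)", v, as.integer(df))
  }, character(1))
  stats::as.formula(paste(response, "~", paste(terms, collapse = " + ")))
}

toy <- data.frame(y    = factor(c("a", "b", "a", "b")),
                  flag = c(0, 1, 1, 0),
                  x    = c(0.3, 1.2, 2.5, 3.1))
f <- build_auto_formula(toy, "y", df = 4)
deparse(f)
```

Here `flag` takes two values and enters as a linear term, while `x` is wrapped in a smoothing spline term with the requested degrees of freedom.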
Four alternative fusion rules for member classifier outputs can be specified. Possible values are 'avgagg' for average aggregation (default), 'majvote' for majority voting, 'w.avgagg' for weighted average aggregation, or 'w.majvote' for weighted majority voting. Weighted approaches are based on member classifier error rates.
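To make the four rules concrete, the following base R sketch combines a matrix of member probabilities (rows are observations, columns are member classifiers) under each rule. The helper `fuse_probs`, the 0.5 cutoff used to turn probabilities into votes, and the normalisation of weights are assumptions made for illustration. Only the weight definition (1 - error rate) comes from this page, and the package's own computation may differ in detail.

```r
# Hypothetical helper, NOT part of GAMens. probs: observations x members.
fuse_probs <- function(probs, error_rates = NULL,
                       fusion = c("avgagg", "majvote", "w.avgagg", "w.majvote"),
                       cutoff = 0.5) {
  fusion <- match.arg(fusion)
  weighted <- fusion %in% c("w.avgagg", "w.majvote")
  if (weighted && is.null(error_rates)) {
    stop("error_rates are required for weighted fusion rules")
  }
  # Weighted rules use (1 - error rate); unweighted rules use equal weights.
  w <- if (weighted) 1 - error_rates else rep(1, ncol(probs))
  w <- w / sum(w)
  # Voting rules fuse hard 0/1 votes; averaging rules fuse the probabilities.
  inputs <- if (fusion %in% c("majvote", "w.majvote")) (probs > cutoff) * 1 else probs
  drop(inputs %*% w)
}

probs <- matrix(c(0.9, 0.6, 0.2,
                  0.4, 0.7, 0.1), nrow = 2, byrow = TRUE)
err <- c(0.1, 0.3, 0.4)  # made-up member error rates

avg   <- fuse_probs(probs, fusion = "avgagg")
wvote <- fuse_probs(probs, err, fusion = "w.majvote")
```

In this sketch the voting rules return the (weighted) share of members voting for the positive class, so a final class label can be obtained by comparing the fused score with 0.5.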
An object of class GAMens, which is a list with the following components:
GAMs 
the member GAMs in the ensemble. 
formula 
the formula used to create the GAMens object.
iter 
the ensemble size. 
df 
number of degrees of freedom (df) used for smoothing spline estimation. 
rsm 
indicates whether the Random Subspace Method was used to create the GAMens object.
bagging 
indicates whether bagging was used to create the GAMens object.
rsm_size 
the number of variables used for random feature subsets. 
fusion_method 
the fusion rule that was used to combine member classifier outputs in the ensemble. 
probs 
the class membership probabilities, predicted by the ensemble classifier. 
class 
the class predicted by the ensemble classifier. 
samples 
an array indicating, for every base classifier in the ensemble, which observations were used for training. 
weights 
a vector with weights defined as (1 - error rate). Usage depends upon specification of fusion.
Koen W. De Bock, Kristof Coussement and Dirk Van den Poel
De Bock, K.W. and Van den Poel, D. (2012): "Reconciling Performance and Interpretability in Customer Churn Prediction Modeling Using Ensemble Learning Based on Generalized Additive Models". Expert Systems With Applications, Vol 39, 8, pp. 6816–6826.
De Bock, K. W., Coussement, K. and Van den Poel, D. (2010): "Ensemble Classification based on generalized additive models". Computational Statistics & Data Analysis, Vol 54, 6, pp. 1535–1546.
Breiman, L. (1996): "Bagging predictors". Machine Learning, Vol 24, 2, pp. 123–140.
Hastie, T. and Tibshirani, R. (1990): "Generalized Additive Models", Chapman and Hall, London.
Ho, T. K. (1998): "The random subspace method for constructing decision forests". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 20, 8, pp. 832–844.
## Load the GAMens package and the data (mlbench library should be installed)
library(GAMens)
library(mlbench)
data(Ionosphere)
IonosphereSub <- Ionosphere[,c("V1","V2","V3","V4","V5","Class")]
## Train GAMens using all variables in the IonosphereSub subset
Ionosphere.GAMens <- GAMens(Class~., IonosphereSub, 4, autoform=TRUE,
iter=10)
## Compare classification performance of GAMens, GAMrsm and GAMbag ensembles,
## using 4 nonparametric terms and 2 linear terms
Ionosphere.GAMens <- GAMens(Class~s(V3,4)+s(V4,4)+s(V5,3)+s(V6,5)+V7+V8,
Ionosphere, 3, autoform=FALSE, iter=10)
Ionosphere.GAMrsm <- GAMens(Class~s(V3,4)+s(V4,4)+s(V5,3)+s(V6,5)+V7+V8,
Ionosphere, 3, autoform=FALSE, iter=10, bagging=FALSE, rsm=TRUE)
Ionosphere.GAMbag <- GAMens(Class~s(V3,4)+s(V4,4)+s(V5,3)+s(V6,5)+V7+V8,
Ionosphere, 3, autoform=FALSE, iter=10, bagging=TRUE, rsm=FALSE)
## Calculate AUCs (for function colAUC, load caTools library)
## Element [[9]] of the returned list is the probs component (see Value)
library(caTools)
GAMens.auc <- colAUC(Ionosphere.GAMens[[9]], Ionosphere["Class"]=="good",
plotROC=FALSE)
GAMrsm.auc <- colAUC(Ionosphere.GAMrsm[[9]], Ionosphere["Class"]=="good",
plotROC=FALSE)
GAMbag.auc <- colAUC(Ionosphere.GAMbag[[9]], Ionosphere["Class"]=="good",
plotROC=FALSE)
