epx forms phalanxes of variables from training data for
binary classification with a rare class. The phalanxes are
disjoint subsets of variables, each of which is fit with a base classifier.
Together they form an ensemble.
x 
Explanatory variables (predictors, features) contained in a data frame. 
y 
Binary response variable vector (numeric or integer): 1 for the rare class, 0 for the majority class. 
phalanxes.initial 
Initial variable group indices; default is one group per variable. Example: vector c(1, 1, 2, 2, 3, ...) puts variables 1 and 2 in group 1, variables 3 and 4 in group 2, etc. Indices cannot be skipped, e.g., c(1, 3, 3, 4, 4, 3, 1) skips group 2 and is invalid. 
alpha 
Lower-tail probability for the critical quantile of the reference
distribution of the 
nsim 
Number of simulations for the reference empirical distribution of the performance measure; default is 1000. 
rmin.target 
To merge the pair of groups with the
minimum ratio of performance measures (ensemble of models to single model)
into a single group their ratio must be less than

classifier 
Base classifier, one of

classifier.args 
Arguments for the base 
performance 
Performance assessment metric, one of

performance.args 
Arguments for the 
computing 
Whether to compute sequentially or in parallel. Input is one
of 
... 
Further arguments passed to or from other methods. 
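The no-skipped-indices rule for phalanxes.initial can be checked before calling epx. The following is a minimal sketch in base R; valid_initial_groups is a hypothetical helper written for illustration and is not part of the package, and epx may apply its own (possibly different) validation.

```r
# Hypothetical helper (not part of the EPX package): returns TRUE when a vector
# of initial group indices uses every integer from 1 to max(groups), i.e. when
# no group index is skipped.
valid_initial_groups <- function(groups) {
  is.numeric(groups) &&
    length(groups) > 0 &&
    !anyNA(groups) &&
    all(groups >= 1) &&
    all(groups == round(groups)) &&
    setequal(unique(groups), seq_len(max(groups)))
}

valid_initial_groups(c(1, 1, 2, 2, 3))        # groups 1, 2 and 3 are all used
valid_initial_groups(c(1, 3, 3, 4, 4, 3, 1))  # group 2 is skipped
```

The first call corresponds to the valid example given for phalanxes.initial, the second to the invalid one.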
Please see Tomal et al. (2015) for more description of phalanx formation.
Returns an object of class epx, which is
a list containing the following components:
PHALANXES 
List of four vectors, each the same length as the number of
explanatory variables (columns in 
PHALANXES.FINAL.PERFORMANCE 
Vector of 
PHALANXES.FINAL.FITS 
A matrix with number of rows equal to the number
of observations in the training data and number of columns equal to the
number of final phalanxes. Column i contains the predicted
probabilities of class 1 from fitting the base 
ENSEMBLED.FITS 
The predicted probabilities of class 1 from the
ensemble of phalanxes based on 
BASE.CLASSIFIER.ARGS 
(Parsed) record of user-specified arguments for

PERFORMANCE.ARGS 
(Parsed) record of user-specified arguments for

X 
User-provided data frame of explanatory variables. 
Y 
User-provided binary response vector. 
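Each vector in PHALANXES has one entry per explanatory variable, and the Examples note that 0 means the variable is not in a phalanx. A minimal sketch of turning one such vector into lists of column indices follows; the membership vector is made up for illustration, and in a real fit it would be one of the four vectors in model$PHALANXES.

```r
# Hypothetical membership vector for six explanatory variables (assumed values,
# not output from epx): entry j is the phalanx of variable j, 0 = not in a phalanx.
membership <- c(1, 1, 0, 2, 2, 1)

# Column indices grouped by phalanx, dropping variables coded 0.
in_phalanx <- membership > 0
phalanx_vars <- split(which(in_phalanx), membership[in_phalanx])
phalanx_vars
```

The resulting list has one element per phalanx, named by its index, which can be used to subset the columns of the data frame of explanatory variables.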
Tomal, J. H., Welch, W. J., & Zamar, R. H. (2015). Ensembling classification models based on phalanxes of variables with applications in drug discovery. The Annals of Applied Statistics, 9(1), 69-93. doi: 10.1214/14-AOAS778
summary.epx prints a summary of the results,
and cv.epx assesses performance via cross-validation.
# Example with data(harvest)
## Phalanx-formation using a base classifier with 50 trees (default = 500)
set.seed(761)
model <- epx(x = harvest[, -4], y = harvest[, 4],
             classifier.args = list(ntree = 50))
## Phalanx-membership of explanatory variables at the four stages
## of phalanx formation (0 means not in a phalanx)
model$PHALANXES
## Summary of the final phalanxes (matches above)
summary(model)
## Not run:
## Parallel computing
clusters <- parallel::detectCores()
cl <- parallel::makeCluster(clusters)
doParallel::registerDoParallel(cl)
set.seed(761)
model.par <- epx(x = harvest[, -4], y = harvest[, 4],
                 computing = "parallel")
parallel::stopCluster(cl)
## End(Not run)
