# gpe: Derive a General Prediction Ensemble (gpe) In pre: Prediction Rule Ensembles

## Description

Provides an interface for deriving sparse prediction ensembles where basis functions are selected through L1 penalization.

## Usage

```r
gpe(
  formula,
  data,
  base_learners = list(gpe_trees(), gpe_linear()),
  weights = rep(1, times = nrow(data)),
  sample_func = gpe_sample(),
  verbose = FALSE,
  penalized_trainer = gpe_cv.glmnet(),
  model = TRUE
)
```

## Arguments

| Argument | Description |
|---|---|
| `formula` | Symbolic description of the model to be fit, of the form `y ~ x1 + x2 + ... + xn`. If the output variable (left-hand side of the formula) is a factor, an ensemble for binary classification is created. Otherwise, an ensemble for prediction of a continuous variable is created. |
| `data` | `data.frame` containing the variables in the model. |
| `base_learners` | List of functions, each of which has formal arguments `formula`, `data`, `weights`, `sample_func`, `verbose`, and `family` and returns a character vector with terms for the final formula passed to `cv.glmnet`. See `gpe_linear`, `gpe_trees`, and `gpe_earth`. |
| `weights` | Case weights with length equal to the number of rows in `data`. |
| `sample_func` | Function used to sample when learning with base learners. The function should have formal arguments `n` and `weights` and return a vector of indices. See `gpe_sample`. |
| `verbose` | `TRUE` if progress messages should be printed throughout the computations. |
| `penalized_trainer` | Function with formal arguments `x`, `y`, `weights`, and `family` which returns a fit object. This can be changed to test other "penalized trainers" (for example, functions that apply an L1, L2, or elastic net penalty). Not using `cv.glmnet` may cause other functions for `gpe` objects to fail. See `gpe_cv.glmnet`. |
| `model` | `TRUE` if the `data` should be added to the returned object. |
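As an illustration of the `penalized_trainer` contract described above, the following is a minimal sketch of a trainer with the required formal arguments `x`, `y`, `weights`, and `family`. The factory name `make_elastic_net_trainer` and the object `elastic_trainer` are hypothetical and not part of the package. The sketch assumes the `glmnet` package is installed and still relies on `cv.glmnet`, since the documentation warns that other fit objects may break downstream functions for `gpe` objects.

```r
# Hypothetical factory (not part of pre): returns a trainer whose formal
# arguments match what gpe() expects from `penalized_trainer`.
make_elastic_net_trainer <- function(alpha = 0.5) {
  function(x, y, weights, family) {
    # Assumes glmnet is installed. `alpha` mixes the L1 and L2 penalties:
    # alpha = 1 is the lasso, alpha = 0 is ridge regression.
    glmnet::cv.glmnet(x = x, y = y, weights = weights,
                      family = family, alpha = alpha)
  }
}

elastic_trainer <- make_elastic_net_trainer(alpha = 0.5)

# Check that the trainer exposes the required formal arguments.
names(formals(elastic_trainer))
```

Such a function could then be supplied as `penalized_trainer = elastic_trainer` in a call to `gpe()`. Whether the resulting fit behaves well with all methods for `gpe` objects has not been verified here.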

## Details

Provides a more general framework for making a sparse prediction ensemble than `pre`.

By default, a similar fit to `pre` is obtained. In addition, multivariate adaptive regression splines (Friedman, 1991) can be included with `gpe_earth`. See examples.

Other custom base learners can be implemented. See `gpe_trees`, `gpe_linear` or `gpe_earth` for details of the setup. The sampling function given by `sample_func` can also be replaced by a custom sampling function. See `gpe_sample` for details of the setup.
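A custom sampling function only needs the formal arguments `n` and `weights` and must return a vector of row indices. The following is a minimal sketch under that assumption. The names `make_subsampler` and `my_sampler` are hypothetical, and the sampling scheme (weighted subsampling without replacement) is chosen for illustration only; it is not claimed to match the behaviour of `gpe_sample`.

```r
# Hypothetical factory (not part of pre): returns a sampler with the
# formal arguments `n` and `weights` that gpe() expects from `sample_func`.
make_subsampler <- function(frac = 0.5) {
  stopifnot(frac > 0, frac <= 1)
  function(n, weights) {
    # Number of rows to draw; always at least one.
    size <- max(1L, round(frac * n))
    # Draw row indices without replacement, with selection
    # probabilities proportional to the case weights.
    sample.int(n, size = size, replace = FALSE, prob = weights)
  }
}

my_sampler <- make_subsampler(frac = 0.5)

set.seed(1)
idx <- my_sampler(n = 10, weights = rep(1, 10))
length(idx)  # 5 indices, each between 1 and 10
```

The sampler could then be passed as `sample_func = my_sampler` in a call to `gpe()`.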

## Value

An object of class `gpe`.

## References

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.

Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1-67.

## See Also

`pre`, `gpe_trees`, `gpe_linear`, `gpe_earth`, `gpe_sample`, `gpe_cv.glmnet`

## Examples

```r
## Not run:
## Obtain similar fit to pre:
gpe.rules <- gpe(Ozone ~ ., data = airquality[complete.cases(airquality),],
  base_learners = list(gpe_linear(), gpe_trees()))
gpe.rules

## Also include products of hinge functions using MARS:
gpe.hinge <- gpe(Ozone ~ ., data = airquality[complete.cases(airquality),],
  base_learners = list(gpe_linear(), gpe_trees(), gpe_earth()))

## End(Not run)
```