# gpe_trees: Learner Functions Generators for gpe In pre: Prediction Rule Ensembles

 gpe_trees R Documentation

## Learner Functions Generators for gpe

### Description

Functions to get "learner" functions for `gpe`.

### Usage

```
gpe_trees(
...,
remove_duplicates_complements = TRUE,
mtry = Inf,
ntrees = 500,
maxdepth = 3L,
learnrate = 0.01,
parallel = FALSE,
tree.control = ctree_control(mtry = mtry, maxdepth = maxdepth)
)

gpe_linear(..., winsfrac = 0.025, normalize = TRUE)

gpe_earth(
...,
degree = 3,
nk = 8,
normalize = TRUE,
ntrain = 100,
learnrate = 0.1,
cor_thresh = 0.99
)
```

### Arguments

| Argument | Description |
|---|---|
| `...` | Currently not used. |
| `remove_duplicates_complements` | Logical. Should rules with complementary or duplicate support be removed? |
| `mtry` | Number of input variables randomly sampled as candidates at each node for random-forest-like algorithms. The argument is passed to the tree methods in the `partykit` package. |
| `ntrees` | Number of trees to fit. Will not have an effect if `tree.control` is used. |
| `maxdepth` | Maximum depth of trees. Will not have an effect if `tree.control` is used. |
| `learnrate` | Learning rate for methods. Corresponds to the ν parameter in Friedman & Popescu (2008). |
| `parallel` | Logical. Should basis functions be found in parallel? |
| `use_grad` | Logical. Should binary outcomes use gradient boosting with regression trees when `learnrate > 0`? That is, use `ctree` instead of `glmtree`, as in Friedman (2001), with a second-order Taylor expansion instead of a first-order one, as in Chen and Guestrin (2016). |
| `tree.control` | `ctree_control` with options for the `ctree` function. |
| `winsfrac` | Quantile at which to winsorize linear terms. The value should be in [0, 0.5). |
| `normalize` | Logical. Should values be scaled by 0.4 times the inverse standard deviation? If `TRUE`, linear terms get the same influence as a typical rule. |
| `degree` | Maximum degree of interactions in the `earth` model. |
| `nk` | Maximum number of basis functions in the `earth` model. |
| `ntrain` | Number of models to fit. |
| `cor_thresh` | A threshold on the pairwise correlation for removal of basis functions. This is similar to `remove_duplicates_complements`. One of the basis functions in each pair whose correlation exceeds the threshold is excluded. `NULL` implies no exclusion. Setting a value closer to zero will decrease the time needed to fit the final model. |
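The effect of `cor_thresh` can be illustrated with a small base-R sketch. The helper `drop_correlated` below is hypothetical and is not the package's internal code; using the absolute correlation and keeping the first member of each pair are assumptions made for illustration only.

```r
# Hypothetical helper: greedily keep a column only if its absolute
# correlation with every column kept so far does not exceed the threshold.
drop_correlated <- function(B, cor_thresh = 0.99) {
  if (is.null(cor_thresh)) return(B)  # NULL implies no exclusion
  cors <- abs(cor(B))
  keep <- integer(0)
  for (j in seq_len(ncol(B))) {
    if (all(cors[j, keep] <= cor_thresh)) keep <- c(keep, j)
  }
  B[, keep, drop = FALSE]
}

set.seed(1)
x <- rnorm(200)
# Column "b" is a near copy of "a"; column "c" is unrelated noise.
B <- cbind(a = x, b = x + rnorm(200, sd = 0.01), c = rnorm(200))
kept <- colnames(drop_correlated(B, cor_thresh = 0.99))
print(kept)
```

Lowering the threshold removes more columns, which is why a smaller `cor_thresh` leaves fewer terms for the final model to fit.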

### Details

`gpe_trees` provides learners for tree methods. Either `ctree` or `glmtree` from the `partykit` package is used.
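A minimal usage sketch, assuming the `pre` package is installed: the generators are called first, and the resulting learner functions are passed to `gpe` through its `base_learners` argument. The `airquality` data set and the argument values are illustrative choices, not recommendations.

```r
library(pre)

set.seed(42)
airq <- airquality[complete.cases(airquality), ]

# Each generator returns a learner function; nothing is fitted yet.
tree_learner   <- gpe_trees(ntrees = 100, maxdepth = 3L, learnrate = 0.01)
linear_learner <- gpe_linear(winsfrac = 0.025)

# gpe() calls the learners to build terms and then fits the final model.
fit <- gpe(Ozone ~ ., data = airq,
           base_learners = list(tree_learner, linear_learner))
```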

`gpe_linear` provides linear terms for `gpe`.
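What `winsfrac` and `normalize` do to a single predictor can be sketched in base R. The function `winsorize_scale` is a hypothetical name, and computing the standard deviation on the winsorized values is an assumption that follows Friedman & Popescu (2008); it is not taken from the package source.

```r
# Hypothetical helper illustrating winsorizing and normalizing a linear term.
winsorize_scale <- function(x, winsfrac = 0.025, normalize = TRUE) {
  stopifnot(winsfrac >= 0, winsfrac < 0.5)
  # Clamp values to the winsfrac and (1 - winsfrac) quantiles.
  q <- quantile(x, probs = c(winsfrac, 1 - winsfrac), names = FALSE)
  w <- pmin(pmax(x, q[1]), q[2])
  # Scale by 0.4 times the inverse standard deviation (assumed: sd of w).
  if (normalize) w <- 0.4 * w / sd(w)
  w
}

x <- c(1:99, 1000)                      # one extreme outlier
w_raw <- winsorize_scale(x, normalize = FALSE)
w_std <- winsorize_scale(x, normalize = TRUE)
print(range(w_raw))                     # the outlier is pulled in
print(sd(w_std))
```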

`gpe_earth` provides basis functions where each factor is a hinge function. The model is estimated with `earth`.
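As a rough illustration of the basis functions described above, a hinge function has the form max(0, x - c) or max(0, c - x) for a knot c, and an interaction term is a product of such factors. The helper `hinge` below is a hypothetical name written in base R; it does not call `earth`.

```r
# Hypothetical helper: direction = 1 gives max(0, x - knot),
# direction = -1 gives max(0, knot - x).
hinge <- function(x, knot, direction = 1) {
  pmax(0, direction * (x - knot))
}

x <- c(-2, -1, 0, 1, 2)
z <- rep(3, 5)

right <- hinge(x, knot = 0)                  # 0 0 0 1 2
left  <- hinge(x, knot = 0, direction = -1)  # 2 1 0 0 0

# A degree-2 basis function is the product of two hinge factors.
interaction_term <- hinge(x, knot = 0) * hinge(z, knot = 1)  # 0 0 0 2 4
```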

### Value

A function that has formal arguments `formula`, `data`, `weights`, `sample_func`, `verbose`, `family`, `...`. The function returns a character vector where each element is a term for the final formula in the call to `cv.glmnet`.
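The returned object can be inspected directly. This short sketch assumes the `pre` package is installed and checks only what is stated above: that the generator returns a function and which formal arguments it has.

```r
library(pre)

learner <- gpe_linear()

# The generator returns a function rather than a fitted object.
is.function(learner)

# List the formal arguments of the returned learner function.
arg_names <- names(formals(learner))
print(arg_names)
```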

### References

Hothorn, T., & Zeileis, A. (2015). partykit: A modular toolkit for recursive partytioning in R. Journal of Machine Learning Research, 16, 3905-3909.

Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1-67.

Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232.

Friedman, J. H. (1993). Fast MARS. Dept. of Statistics Technical Report No. 110, Stanford University.

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.

Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.

### See Also

`gpe`, `rTerm`, `lTerm`, `eTerm`