An implementation of the RuleFit algorithm as described in Friedman & Popescu (2008) <doi:10.1214/07-AOAS148>. eXtreme Gradient Boosting ('XGBoost') is used to build rules, and 'glmnet' is used to fit a sparse linear model on the raw and rule features. The resulting model learns much as a tree ensemble does, while often being easier to interpret and faster to score in live applications. Several algorithms for reducing rule complexity are provided, most notably hyperrectangle deoverlapping. All algorithms scale to several million rows and support sparse representations to handle tens of thousands of dimensions.
