
rFerns

rFerns is an extended random ferns implementation for R; in comparison to the original, it can handle a standard information system containing both categorical and continuous attributes. Moreover, it generates an OOB error approximation and a permutation-based attribute importance measure, similar to randomForest. Here is a paper with all the details.

rFerns is good for training fast and in predictable time; in general it is less accurate than Random Forest, though not substantially, and there are cases in which it does better. It is also useful as a very fast source of variable importance; in fact, it was created to speed up the Boruta all-relevant feature selector, and it did pretty well. Finally, it is a very stochastic method that does practically no optimisation at all; basically, it is crazy that it works. Hence, it is theoretically interesting (;

Since v2.0.1, it supports merging of rFerns models, making it possible to implement adaptive ensemble size or something like online learning.
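As a rough sketch of what merging looks like, the snippet below trains two models on the same data and combines them with `merge()`. The iris data, the seed and the fern counts are only illustrative choices, not recommendations.

```r
library(rFerns)

set.seed(77)
data(iris)

# Train two independent ensembles on the same set of objects
modelA <- rFerns(Species ~ ., data = iris, ferns = 500)
modelB <- rFerns(Species ~ ., data = iris, ferns = 500)

# Combine them into a single, larger ensemble
modelAB <- merge(modelA, modelB)
print(modelAB)

# The merged model can be used for prediction like any other rFerns model
preds <- predict(modelAB, iris[, -5])
```

Repeating the train-and-merge step in a loop, and stopping once the result is good enough, would be one way to approach an adaptive ensemble size.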

Since v2.0.0, it can do shadow importance, i.e., a heuristic way to reason about the significance of importance scores. See here for details.
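A minimal sketch of requesting shadow importance, assuming the iris data as a stand-in for a real problem. Comparing each attribute's score with its shadow counterpart, as done at the end, is just one simple way to read the result and not an official significance test.

```r
library(rFerns)

set.seed(77)
data(iris)

# importance = "shadow" adds scores for permuted copies of the attributes
model <- rFerns(Species ~ ., data = iris, importance = "shadow")

# One row per attribute; the Shadow column holds the score of the permuted copy
print(model$importance)

# Illustrative reading: attributes that score above their own shadow
imp <- model$importance
rownames(imp)[imp$MeanScoreLoss > imp$Shadow]
```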

Since v0.3.2, it can do multi-label classification as well; here is a conference paper about that (arXiv version).

There is also a Spark version (not mine), Sparkling Ferns; also this.

How to use

A fairly recent version should be on CRAN; see the R docs for more.
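For orientation, a basic session could look like the sketch below. It uses the iris data purely as an example; the `depth` and `ferns` values are arbitrary illustrative settings.

```r
# install.packages("rFerns")  # one-time installation from CRAN
library(rFerns)

set.seed(77)
data(iris)

# Train a model and ask for the permutation-based importance as well
model <- rFerns(Species ~ ., data = iris, depth = 5, ferns = 1000,
                importance = "simple")
print(model)

# OOB approximation of the classification error
model$oobErr

# Attribute importance scores, one row per attribute
model$importance

# Predict classes for (here, the same) objects
preds <- predict(model, iris[, -5])
table(predicted = preds, actual = iris$Species)
```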

If you want to use it or test it outside R, that is quite possible; consult side_src/test.c to see how this may work. Just don't expect this to ever become a standalone library.



mbq/rFerns documentation built on May 22, 2019, 12:57 p.m.