MadelonD: Pre-discretised Madelon dataset

MadelonD    R Documentation

Pre-discretised Madelon dataset


Madelon is a synthetic data set from the NIPS 2003 feature selection challenge, generated by Isabelle Guyon. It contains 480 irrelevant and 20 relevant features; of the relevant ones, 5 are informative and 15 are redundant. In this version, the originally numerical features have been pre-cut into 10 bins, and their names have been altered to reveal the 20 relevant features (as identified by the Boruta method).




A list with two elements: X, a data frame with the predictors, and Y, the decision. Features are in the same order as in the original data; the names of relevant ones start with Rel, while those of irrelevant ones start with Irr.
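
The structure described above can be inspected with a few lines of R. This is a minimal sketch, assuming the praznik package is installed and the data set is loaded with data(MadelonD); the variable names (X, Y, is_rel, is_irr, bins, sel) are illustrative only, and the final call to MIM() is just one possible way to try a filter on the data, not a recommended workflow.

```r
library(praznik)

# Load the pre-discretised data set shipped with praznik
data(MadelonD)

X <- MadelonD$X  # data frame with predictors
Y <- MadelonD$Y  # the decision

# Count features by name prefix; the description above states
# 20 relevant ("Rel...") and 480 irrelevant ("Irr...") features
is_rel <- startsWith(names(X), "Rel")
is_irr <- startsWith(names(X), "Irr")
print(c(relevant = sum(is_rel), irrelevant = sum(is_irr)))

# Number of distinct values per feature; the description above
# says each feature was pre-cut into 10 bins
bins <- vapply(X, function(col) length(unique(col)), integer(1))
print(table(bins))

# Illustrative only: pick 20 features with the mutual information
# maximisation filter and see how many of them are marked relevant
sel <- MIM(X, Y, k = 20)
print(sum(startsWith(names(sel$selection), "Rel")))
```

Because the relevant features are flagged in their names, the last line gives a quick check of how many of the selected features fall among the 20 marked as relevant.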


praznik documentation built on May 20, 2022, 5:06 p.m.