Automatically generate new features from existing ones for further modelling, using the SAFE algorithm proposed in a paper by Shi, Zhang, Li, Yang and Zhou. This is a direct implementation of the pseudocode given in the paper, with its conventions, notation and flaws.
X_train |
Matrix - data used to train model. Must be numerical. |
y_train |
Factor - labels for training data. Must be binary. |
X_valid |
Matrix - data used to test model. Must be numerical. |
y_valid |
Factor - labels for testing data. Must be binary. |
operators |
A |
n_iter |
Integer; number of iterations for the algorithm to perform. |
nrounds |
Integer for |
alpha |
Threshold for |
gamma |
Integer; number of most important feature combinations to select in each iteration. |
bins |
Integer; number of bins used to discretize features. |
theta |
Threshold for Pearson's correlation. Features with correlation above theta will be dropped. |
beta |
Integer; Maximum amount of features to be selected at the end of each loop. Set to |
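To make the role of theta more concrete, here is a minimal sketch of a Pearson-correlation filter in base R. The helper name drop_correlated is hypothetical and not part of this package, and the sketch assumes that the absolute correlation is compared against theta and that the earlier of two correlated columns is kept; the package itself may order or compare features differently.

```r
# Hypothetical helper (not part of the package): drop every feature whose
# absolute Pearson correlation with an earlier, kept feature exceeds theta.
drop_correlated <- function(X, theta = 0.8) {
  corr <- abs(cor(X, method = "pearson"))
  keep <- rep(TRUE, ncol(X))
  for (j in seq_len(ncol(X))) {
    if (!keep[j]) next
    # Only columns after j can be dropped because of column j.
    later <- seq_len(ncol(X)) > j
    keep[later & corr[j, ] > theta] <- FALSE
  }
  X[, keep, drop = FALSE]
}

# Toy data: column b is almost a copy of column a, column c is independent.
set.seed(1)
a <- rnorm(100)
X <- cbind(a = a, b = a + rnorm(100, sd = 0.01), c = rnorm(100))

X_reduced <- drop_correlated(X, theta = 0.8)
colnames(X_reduced)
```

With a lower theta more features are removed; with theta close to 1 only near-duplicates are dropped.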
A list with 2 elements: X_train and X_test.
Both contain transformed train and test sets, ready for further modelling.
Unfortunately, this differs from the algorithm described in the paper, which returns a function, at least for now.
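As an illustration of how the returned list might be consumed, the sketch below fits a logistic regression on X_train and predicts on X_test. The object res is a stand-in built by hand with random data and made-up column names f1 and f2; in practice it would be the value returned by the function documented here. The choice of glm is only an example of "further modelling", not something the package prescribes.

```r
# Stand-in for the returned list: two numeric matrices with the same columns.
set.seed(42)
res <- list(
  X_train = matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("f1", "f2"))),
  X_test  = matrix(rnorm(100), ncol = 2, dimnames = list(NULL, c("f1", "f2")))
)
# Binary factor labels for the training rows (random here, for illustration).
y_train <- factor(rbinom(nrow(res$X_train), 1, 0.5))

# Fit any downstream model on the transformed training set.
train_df <- data.frame(res$X_train, y = y_train)
fit <- glm(y ~ ., data = train_df, family = binomial())

# Score the transformed test set with the same columns.
pred <- predict(fit, newdata = data.frame(res$X_test), type = "response")
head(pred)
```

Because the function returns transformed data rather than a transformation function, new data cannot be transformed afterwards with this output alone; both sets have to be passed in up front.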