rule_double: Outlying bivariate linear continuous association rule finder
In Laurae2/Laurae: Advanced High Performance Data Science Toolbox for R


This function searches for association rules on outlying bivariate linear continuous features against a binary label. The predicted label is 0, and the risk of overfitting is very high (see Kaggle's Santander Customer Satisfaction competition). Unlike the univariate rule finder, it cannot be used to score outliers first (a 300-feature matrix can grow to about 9000 features). Verbose output is always on and cannot be disabled. If you need this function without it, remove the verbose messages from the source and recompile the package.

rule_double(data, label, train_rows = length(label), iterations = 1000,
  minimal_score = 25, minimal_node = 5, false_negatives = 2,
  seed = 11111)

`data`	The data.frame containing the features to make association rules on, or the scoring matrix. Missing values are not allowed.
`label`	The target label as an integer vector (each value must be either 0 or 1). 1 must be the minority label.
`train_rows`	The number of rows used for training the association rules. Must be the size of your training set, which is equal to `length(label)`. Defaults to `length(label)`.
`iterations`	The maximum number of iterations allowed for limited-memory gradient descent. Defaults to `1000`.
`minimal_score`	The association rule finder will not accept any node under the allowed outlying score. Defaults to `25`.
`minimal_node`	The association rule finder will not accept any node containing fewer than this number of samples. Defaults to `5`.
`false_negatives`	The association rule will allow at most (`false_negatives - 1`) false negatives. A higher value makes the algorithm more permissive; a lower value makes it very difficult to converge (or to find any rule at all). Defaults to `2`.
`seed`	The random seed for reproducibility. Defaults to `11111`.
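
The input constraints listed above can be checked before calling `rule_double`. The following is a minimal sketch in base R and is not part of the package: `check_rule_inputs` is a hypothetical helper name, and the toy data is made up for illustration. It assumes that the training rows come first in `data`, as the indexing in the package example suggests.

```r
# Hypothetical helper (not part of Laurae): checks the documented input constraints.
check_rule_inputs <- function(data, label, train_rows = length(label)) {
  # Missing values are not allowed in the data
  stopifnot(!any(is.na(data)))
  # The label may only contain 0 and 1
  stopifnot(all(label %in% c(0L, 1L)))
  # 1 must be the minority label
  stopifnot(sum(label == 1) < sum(label == 0))
  # train_rows must match the label length and fit inside the data
  # (assumption: training rows are stored before any test rows)
  stopifnot(train_rows == length(label), nrow(data) >= train_rows)
  invisible(TRUE)
}

# Toy data: 8 labeled training rows followed by 2 unlabeled test rows
toy_data <- data.frame(x1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                       x2 = c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1))
toy_label <- c(0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L)

ok <- check_rule_inputs(toy_data, toy_label)
```

Failing early with `stopifnot` is a design choice for this sketch; a real pipeline might prefer more descriptive error messages.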

A vector with `nrow(data)` elements: the general result for each observation using bivariate rules.
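
As a sketch of how the returned vector might be consumed, the snippet below uses a made-up `rules` vector in place of real `rule_double` output. It assumes, following the indexing in the package example, that the first `length(label)` entries correspond to training rows and the remainder to test rows, and that a value of 0 marks an observation for which a rule predicts label 0. All names and values here are illustrative.

```r
# Made-up stand-in for rule_double output (not real results)
label <- c(0L, 1L, 0L, 0L, 1L, 0L)        # 6 training labels
rules <- c(1, 1, 0, 1, 1, 1, 0, 1, 1, 0)  # 6 training entries + 4 test entries

# Keep only the entries that belong to the test rows
n_train <- length(label)
test_rules <- rules[(n_train + 1):length(rules)]

# Hypothetical model predictions for the 4 test rows
preds <- c(0.30, 0.12, 0.45, 0.08)

# Override predictions with 0 wherever a rule flags the observation
preds[test_rules == 0] <- 0
```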

## Not run: 
rules <- rule_double(data = scored_data, label = target, iterations = 100)
# Set test-set predictions to 0 wherever a bivariate rule returns 0
preds[rules[(length(target) + 1):nrow(scored_data)] == 0] <- 0

## End(Not run)