rule_single: Outlying univariate continuous association rule finder
In Laurae2/Laurae: Advanced High Performance Data Science Toolbox for R


This function searches for association rules on outlying univariate continuous features against a binary label. The predicted label is 0, and the risk of overfitting is very high (see Kaggle's Santander Customer Satisfaction competition). It can be used to score outliers first and then build rules afterwards if needed. Verbose output is always on and cannot be disabled. If you need this function without it, remove the verbose messages from the source and recompile the package.


rule_single(data, label, train_rows = length(label), iterations = 1000,
  minimal_score = 25, minimal_node = 5, false_negatives = 2,
  seed = 11111, scoring = TRUE, ruling = TRUE)

`data`	The data.frame containing the features to make association rules on, or the scoring matrix. Missing values are not allowed.
`label`	The target label as an integer vector (each value must be either 0 or 1). 1 must be the minority label.
`train_rows`	The rows used for training the association rules. Must be your training set, whose length is equal to `length(label)`. Defaults to `length(label)`.
`iterations`	The number of iterations allowed for limited-memory Gradient Descent. Defaults to `1000`.
`minimal_score`	The association rule finder will not accept any node whose outlying score is below this value. Defaults to `25`.
`minimal_node`	The association rule finder will not accept any node containing fewer than this number of samples. Defaults to `5`.
`false_negatives`	The association rule will allow at most (`false_negatives - 1`) false negatives. A higher value makes the algorithm more permissive; a lower value makes it very difficult to converge (or to find any rule at all). Defaults to `2`.
`seed`	The random seed for reproducibility. Defaults to `11111`.
`scoring`	Whether to score features before computing the association rules. Defaults to `TRUE`.
`ruling`	Whether to rule features (useful when you only want the scores). Defaults to `TRUE`.
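
The node constraints described by `minimal_node` and `false_negatives` can be illustrated with a small base R sketch. This is not the package's implementation: `node_is_acceptable` is a hypothetical helper, the node is assumed to be "all rows above a threshold", and the `minimal_score` check is left out because the scoring formula is not documented here. Since the predicted label is 0, every label-1 row inside the node counts as a false negative.

```r
# Hypothetical helper, NOT part of the Laurae package: checks whether the rows
# with x above `threshold` form an acceptable "predict 0" node.
node_is_acceptable <- function(x, y, threshold,
                               minimal_node = 5, false_negatives = 2) {
  in_node <- x > threshold
  node_size <- sum(in_node)
  # Label-1 rows inside a node predicted as 0 are false negatives.
  missed_positives <- sum(y[in_node] == 1)
  node_size >= minimal_node && missed_positives <= false_negatives - 1
}

# Toy data: ten typical values followed by six outlying ones.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50, 60, 70, 80, 90, 100)
y <- c(0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L)

node_is_acceptable(x, y, threshold = 40)                       # 6 rows, 1 positive
node_is_acceptable(x, y, threshold = 40, false_negatives = 1)  # no positive allowed
node_is_acceptable(x, y, threshold = 75)                       # only 3 rows in the node
```

With the defaults, the first call accepts the node (six rows, one false negative), while the other two reject it: one because no false negative is tolerated, the other because the node is smaller than `minimal_node`.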

A list with one to three elements: "scores", the outlying scores for features; "parsed_scores", the association rule result on specific features; and "output", the general association rule result per observation.

## Not run: 
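# Step 1: compute the outlying scores only (no rules yet).
scored_data <- rule_single(data = data, label = NA, scoring = TRUE, ruling = FALSE)
# Step 2: build the rules from the precomputed scores (scoring = FALSE skips rescoring).
rules <- rule_single(data = scored_data, label = target,
                     iterations = 100, scoring = FALSE, ruling = TRUE)
# Step 3: set predictions to 0 for the rows after the training rows where rules$output is 0.
preds[rules$output[(length(target) + 1):nrow(data)] == 0] <- 0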

## End(Not run)
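
The indexing in the last step of the example can be tried on its own with hand-built stand-ins. Everything below is an assumption made for illustration: `rules` is a mock list rather than real `rule_single` output, `n_total` stands in for `nrow(data)`, and an `output` value of 0 is treated as "override the prediction with 0", mirroring the example above.

```r
# Stand-ins for illustration only; real values come from your data and rule_single().
target  <- c(0L, 1L, 0L, 0L)                      # labels of the 4 training rows
n_total <- 7                                      # 4 training rows + 3 test rows
rules   <- list(output = c(1, 1, 0, 1, 0, 1, 0))  # assumed: one value per observation
preds   <- c(0.8, 0.3, 0.6)                       # model predictions for the 3 test rows

# Parentheses matter: `:` binds tighter than `+` in R.
test_rows <- (length(target) + 1):n_total

# Overwrite only the test predictions whose rule output is 0.
preds[rules$output[test_rows] == 0] <- 0
preds
```

Note that the assignment is written as `preds[...] <- 0`. Chaining it as `preds <- preds[...] <- 0` would replace the whole vector with the single value 0.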