In ianjjohnson/arulesCBA: Classification Based on Association Rules


RCAR	R Documentation

Regularized Class Association Rules for Multi-class Problems (RCAR+)

Description

Build a classifier based on association rules mined from an input dataset and weighted with LASSO-regularized logistic regression, following RCAR (Azmi et al., 2019). RCAR+ extends RCAR from a binary classifier to a multi-class classifier and can use support-balanced CARs.

Usage

RCAR(
  formula,
  data,
  lambda = NULL,
  alpha = 1,
  glmnet.args = NULL,
  cv.glmnet.args = NULL,
  parameter = NULL,
  control = NULL,
  balanceSupport = FALSE,
  disc.method = "mdlp",
  verbose = FALSE,
  ...
)

Arguments

`formula`	A symbolic description of the model to be fitted. Must be of the form `class ~ .` or `class ~ predictor1 + predictor2`.
`data`	A data.frame containing the training data.
`lambda`	The amount of weight given to regularization during the logistic regression learning process. If not specified (`NULL`) then cross-validation is used to determine the best value (see Details section).
`alpha`	The elastic net mixing parameter. `alpha = 1` is the lasso penalty (default RCAR), and `alpha = 0` the ridge penalty.
`cv.glmnet.args, glmnet.args`	A list of arguments passed on to `cv.glmnet` and `glmnet`, respectively. See Example section.
`parameter, control`	Optional parameter and control lists for apriori.
`balanceSupport`	The `balanceSupport` parameter passed on to the `mineCARs` function.
`disc.method`	Discretization method for factorizing numeric input (default: `"mdlp"`). See `discretizeDF.supervised` for more supervised discretization methods.
`verbose`	Report progress?
`...`	For convenience, additional arguments are used to create the `parameter` list for apriori (e.g., to specify the support and confidence thresholds).
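
As a rough illustration of how these arguments combine, the following sketch sets the mining thresholds through `parameter`, uses an elastic net mix, and fixes `lambda` instead of leaving it to cross-validation. The specific values (support 0.05, confidence 0.5, `alpha = 0.5`, `lambda = 0.01`) are arbitrary assumptions chosen for illustration, not recommendations from the package authors.

```r
library(arulesCBA)

data("iris")

# Assumed, illustrative settings: the thresholds and penalty values below
# are arbitrary and only show where each argument goes.
classifier <- RCAR(
  Species ~ .,
  iris,
  parameter = list(support = 0.05, confidence = 0.5),  # passed on to apriori
  alpha = 0.5,    # elastic net mix between ridge (0) and lasso (1)
  lambda = 0.01   # fixed regularization weight
)

classifier
```

Leaving `lambda = NULL` (the default) lets cross-validation pick the value instead.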

Details

RCAR+ extends RCAR from a binary classifier to a multi-class classifier using regularized multinomial logistic regression via glmnet.

If lambda is not specified (NULL), cross-validation is used to determine the best value: the largest value of lambda for which the cross-validated error is within one standard error of the minimum is chosen (see cv.glmnet).

See cv.glmnet for how to perform cross-validation in parallel.
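
The selection rule described above matches the definition of the `lambda.1se` element returned by `cv.glmnet`. The sketch below calls `cv.glmnet` directly to show the difference between `lambda.min` and `lambda.1se`. The binary matrix `x` is a hypothetical stand-in built by binning the iris columns; it is not the rule-coverage matrix that RCAR constructs internally.

```r
library(glmnet)

set.seed(1)

# Hypothetical stand-in for a binary feature matrix: bin each numeric
# iris column into 3 intervals and expand the bins to indicator columns.
binned <- as.data.frame(lapply(iris[1:4], cut, breaks = 3))
x <- model.matrix(~ . - 1, data = binned)
y <- iris$Species

# Cross-validated multinomial lasso (alpha = 1), default 10 folds
cv <- cv.glmnet(x, y, family = "multinomial", alpha = 1)

# lambda.min: value with the lowest cross-validated error
# lambda.1se: largest value whose error is within 1 SE of that minimum
c(lambda.min = cv$lambda.min, lambda.1se = cv$lambda.1se)
```

Because `lambda.1se` is never smaller than `lambda.min`, it yields an equally or more strongly regularized model, which generally means fewer rules with non-zero weight.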

Value

Returns an object of class `CBA` representing the trained classifier, with the additional field `model` containing a list with the following elements:

`all_rules`	all rules used to build the classifier, including the rules with a weight of zero.
`reg_model`	the multinomial logistic regression model as an object of class `glmnet`.
`cv`	contains the results of the cross-validation used to determine lambda.
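
A minimal sketch of how these elements might be accessed on a fitted classifier. It assumes, as the description of `all_rules` suggests, that the rule base returned by `rules()` leaves out rules whose weight was shrunk to zero; the variable names `n_all` and `n_kept` are introduced here for illustration only.

```r
library(arulesCBA)

data("iris")
classifier <- RCAR(Species ~ ., iris)

# Number of mined CARs versus the number of rules in the final rule base
n_all  <- length(classifier$model$all_rules)
n_kept <- length(rules(classifier))
cat("mined CARs:", n_all, " rules in rule base:", n_kept, "\n")

# The fitted regression model is a glmnet object
class(classifier$model$reg_model)
```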

Author(s)

Tyler Giallanza and Michael Hahsler

References

M. Azmi, G.C. Runger, and A. Berrado (2019). Interpretable regularized class association rules algorithm for classification in a categorical data space. Information Sciences, Volume 483, May 2019. Pages 313-331.

Examples


data("iris")

classifier <- RCAR(Species~., iris)
classifier

# inspect the rule base sorted by the largest class weight
inspect(sort(rules(classifier), by = "weight"))

# make predictions for the first few instances of iris
predict(classifier, head(iris))

# inspect the regression model, plot the regularization path, and
# plot the cross-validation results to determine lambda
str(classifier$model$reg_model)
plot(classifier$model$reg_model)
plot(classifier$model$cv)

# show progress report and use 5 instead of the default 10 cross-validation folds.
classifier <- RCAR(Species~., iris, cv.glmnet.args = list(nfolds = 5), verbose = TRUE)

ianjjohnson/arulesCBA documentation built on June 13, 2022, 2:07 p.m.