mineCARs | R Documentation |
Class Association Rules (CARs) are association rules that have only items with class values in the RHS as introduced for the CBA algorithm by Liu et al., 1998.
mineCARs( formula, transactions, parameter = NULL, control = NULL, balanceSupport = FALSE, verbose = TRUE, ... )
formula |
A symbolic description of the model to be fitted. |
transactions |
An object of class |
parameter, control |
Optional parameter and control lists for the
|
balanceSupport |
logical; if |
verbose |
logical; report progress? |
... |
For convenience, the mining parameters for apriori can be
specified as .... Examples are the |
Class association rules (CARs) are of the form
P => c_i,
where the LHS P is a pattern (i.e., an itemset) and c_i is a single items representing the class label.
Mining parameters. Mining parameters for
apriori
can be either specified as a list (or object
of APparameter
) as argument parameter
or, for
convenience, as arguments in ...
. Note: mineCARs
uses
by default a minimum support of 0.1 (for the LHS of the rules via parameter
originalSupport = FALSE
),
a minimum confidence of 0.5 and a maxlen
(rule
length including items in the LHS and RHS) of 5.
Balancing minimum support. Using a single minimum support threshold
for a highly class imbalanced dataset will lead to the problem, that
minority classes will only be presented in very few rules. To address this
issue, balanceSupport = TRUE
can be used to adjust minimum support
for each class dependent on the prevalence of the class (i.e., the frequency
of the c_i in the transactions) similar to the minimum class support
suggested for CBA by Liu et al (2000) we use
minsupp_i = minsupp_t x supp(c_i)/max(supp(C)),
where max(supp(C)) is the support of the majority class. Therefore, the defined minimum support is used for the majority class and then minimum support is scaled down for classes which are less prevalent, giving them a chance to also produce a reasonable amount of rules. In addition, a named numerical vector with a support values for each class can be specified.
Returns an object of class rules
.
Michael Hahsler
Liu, B. Hsu, W. and Ma, Y (1998). Integrating Classification and Association Rule Mining. KDD'98 Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, New York, 27-31 August. AAAI. pp. 80-86.
Liu B., Ma Y., Wong C.K. (2000) Improving an Association Rule Based Classifier. In: Zighed D.A., Komorowski J., Zytkow J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science, vol 1910. Springer, Berlin, Heidelberg.
apriori
, APparameter
,
rules
, transactions
.
data("iris") # discretize and convert to transactions iris.trans <- prepareTransactions(Species ~ ., iris) # mine CARs with items for "Species" in the RHS. # Note: mineCars uses a default a minimum coverage (lhs support) of 0.1, a # minimum confidence of .5 and maxlen of 5 cars <- mineCARs(Species ~ ., iris.trans) inspect(head(cars)) # specify minimum support and confidence cars <- mineCARs(Species ~ ., iris.trans, parameter = list(support = 0.3, confidence = 0.9, maxlen = 3)) inspect(head(cars)) # for convenience this can also be written without a list for parameter using ... cars <- mineCARs(Species ~ ., iris.trans, support = 0.3, confidence = 0.9, maxlen = 3) # restrict the predictors to items starting with "Sepal" cars <- mineCARs(Species ~ Sepal.Length + Sepal.Width, iris.trans) inspect(cars) # using different support for each class cars <- mineCARs(Species ~ ., iris.trans, balanceSupport = c( "Species=setosa" = 0.1, "Species=versicolor" = 0.5, "Species=virginica" = 0.01), confidence = 0.9) cars # balance support for class imbalance data("Lymphography") Lymphography_trans <- as(Lymphography, "transactions") classFrequency(class ~ ., Lymphography_trans) # mining does not produce CARs for the minority classes cars <- mineCARs(class ~ ., Lymphography_trans, support = .3, maxlen = 3) classFrequency(class ~ ., cars, type = "absolute") # Balance support by reducing the minimum support for minority classes cars <- mineCARs(class ~ ., Lymphography_trans, support = .3, maxlen = 3, balanceSupport = TRUE) classFrequency(class ~ ., cars, type = "absolute") # Mine CARs from regular transactions (a negative class item is automatically added) data(Groceries) cars <- mineCARs(`whole milk` ~ ., Groceries, balanceSupport = TRUE, support = 0.01, confidence = 0.8) inspect(sort(cars, by = "lift"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.