ruleInduction: Association Rule Induction from Itemsets

ruleInductionR Documentation

Association Rule Induction from Itemsets

Description

Provides the generic function ruleInduction() and the method to induce all association rules which can be generated by the given set of itemsets from a transactions dataset.

Usage

ruleInduction(x, ...)

## S4 method for signature 'itemsets'
ruleInduction(
  x,
  transactions = NULL,
  confidence = 0.8,
  method = c("ptree", "apriori"),
  reduce = FALSE,
  verbose = FALSE,
  ...
)

Arguments

x

the set of itemsets from which rules will be induced.

...

further arguments.

transactions

the transactions used to mine the itemsets. Can be omitted for method "ptree", if x contains a (complete set) of frequent itemsets together with their support counts.

confidence

a numeric value between 0 and 1 giving the minimum confidence threshold for the rules.

method

"ptree" or "apriori"

reduce

remove unused items to speed up the counting process?

verbose

report progress?

Details

All rules that can be created using the supplied itemsets and that surpass the specified minimum confidence threshold are returned. ruleInduction() can be used to produce closed association rules defined by Pei et al. (2000) as rules ⁠X => Y⁠ where both X and Y are closed frequent itemsets. See the code example in the Example section.

Rule induction implements several induction methods. The default method is "ptree"

  • "ptree" method without transactions: No transactions are need to be specified if x contains a complete set of frequent or itemsets. The itemsets' support counts are stored in a ptree and then retrieved to create rules and calculate rules confidence. This is very fast, but fails because of missing support values if x is not a complete set of frequent itemsets.

  • "ptree" method with transactions: If transactions are specified then all transactions are counted into a prefix tree and later retrieved to create rules from the itemsets and calculate confidence values. This is slower, but necessary if x is not a complete set of frequent itemsets. To improve speed, unused items are removed from the transaction data before creating the prefix tree (this behavior can be changed using the argument reduce). This might be slower for large transaction data sets. However, this is highly recommended as the items are also reordered to reduce the counting time.

  • "apriori" method (always needs transactions): All association rules are mined from the transactions data set using apriori() with the smallest support found in the itemsets. In a second step, all rules which cannot be generated from one of the itemsets are removed. This procedure is very slow, especially for itemsets with many elements or very low support.

Value

An object of class rules.

Author(s)

Christian Buchta and Michael Hahsler

References

Michael Hahsler, Christian Buchta, and Kurt Hornik. Selective association rule generation. Computational Statistics, 23(2):303-315, April 2008.

Jian Pei, Jiawei Han, Runying Mao. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2000).

See Also

Other mining algorithms: APappearance-class, AScontrol-classes, ASparameter-classes, apriori(), eclat(), fim4r(), weclat()

Examples

data("Adult")

## find all closed frequent itemsets
closed_is <- apriori(Adult, target = "closed frequent itemsets", support = 0.4)
closed_is

## use rule induction to produce all closed association rules
closed_rules <- ruleInduction(closed_is, transactions = Adult, verbose = TRUE)

## inspect the resulting closed rules
summary(closed_rules)
inspect(head(closed_rules, by = "lift"))

## get rules from frequent itemsets. Here, transactions does not need to be
## specified for rule induction.
frequent_is  <- eclat(Adult, support = 0.4)
assoc_rules <- ruleInduction(frequent_is)
assoc_rules
inspect(head(assoc_rules))

## for itemsets that are not a complete set of frequent itemsets,
## transactions need to be specified.
some_is <- sample(frequent_is, 10)
some_rules <- ruleInduction(some_is, transactions = Adult)
some_rules

mhahsler/arules documentation built on March 19, 2024, 5:45 p.m.