hierarchy | R Documentation |
Functions to use item hierarchies to aggregate items at different group levels, to perform multi-level transaction analysis.
addAggregate(x, by, postfix = "*")
filterAggregate(x)
aggregate(x, ...)
## S4 method for signature 'itemMatrix'
aggregate(x, by)
## S4 method for signature 'itemsets'
aggregate(x, by)
## S4 method for signature 'rules'
aggregate(x, by)
x |
a transactions, itemsets or rules object. |
by |
name of a field (hierarchy level) available in
itemInfo of x, or (as in Example 2 below) a vector of group labels with one entry per item in x. |
postfix |
characters added to mark group-level items. |
... |
further arguments. |
Often an item hierarchy is available for transactions used for association rule mining. For example, in a supermarket dataset, items like "bread" and "bagel" might belong to the item group (category) "baked goods."
Transactions can store item hierarchies as additional columns in the
itemInfo data.frame ("labels" cannot be used since it is reserved for
the item labels).
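As a minimal sketch of how such a hierarchy column might be attached, the snippet below builds a tiny transactions object and adds a group column to its itemInfo. The basket data, the lookup vector and the column name `category` are hypothetical and chosen only for illustration.

```r
library(arules)

## Hypothetical toy baskets (not part of the package data).
baskets <- list(
  c("bread", "bagel", "milk"),
  c("milk", "butter"),
  c("bagel")
)
trans <- as(baskets, "transactions")

## Hypothetical lookup from item label to item group.
category <- c(bread = "baked goods", bagel = "baked goods",
              milk = "dairy", butter = "dairy")

## Store the hierarchy level as an extra itemInfo column.
## Indexing by itemLabels() keeps the groups aligned with the item order.
itemInfo(trans)$category <- factor(unname(category[itemLabels(trans)]))

itemInfo(trans)
```

Looking groups up by item label, rather than typing them in positional order, avoids depending on how the items happen to be ordered inside the object.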
Aggregation: To perform analysis at a group level of the item
hierarchy, aggregate() produces a new object with items aggregated to
a given group level. A group-level item is present if one or more of the
items in the group are present in the original object. If rules are
aggregated and the aggregation would lead to the same aggregated group item
in the lhs and in the rhs, then that group item is removed from the lhs.
Rules or itemsets that are no longer unique after the aggregation are also
removed. Note also that the quality measures are not applicable to the new
rules and thus are removed. If these measures are required, then aggregate
the transactions before mining rules.
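The "present if one or more items in the group are present" rule can be checked on a small example. This is only a sketch: the toy baskets and the `category` column are hypothetical, and it assumes the hierarchy is stored in itemInfo as described above.

```r
library(arules)

## Hypothetical toy baskets and item groups.
baskets <- list(
  c("bread", "bagel", "milk"),
  c("milk", "butter"),
  c("bagel")
)
trans <- as(baskets, "transactions")
category <- c(bread = "baked goods", bagel = "baked goods",
              milk = "dairy", butter = "dairy")
itemInfo(trans)$category <- factor(unname(category[itemLabels(trans)]))

## Aggregate the four items to their two groups.
trans_cat <- aggregate(trans, by = "category")

## Transaction 1 holds bread and bagel (both "baked goods") plus milk, so it
## should contain each group item once; transaction 3 holds only bagel, so it
## should contain only "baked goods".
aggregated <- unname(LIST(trans_cat))
aggregated
```

Because the aggregation is applied to the transactions here, rules mined from `trans_cat` afterwards would carry their own quality measures.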
Multi-level analysis: To analyze relationships between individual
items and item groups at the same time, addAggregate() can be used to
create a new transactions object which contains both the original items and
the group-level items (marked with a given postfix). In association rule mining,
all items are handled the same, which means that we will produce a large
number of rules of the type:
item A => group of item A
with a confidence of 1. This will also happen if you mine itemsets.
filterAggregate() can be used to filter these spurious rules or
itemsets.
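The reason for the confidence of 1 can be seen directly in the multi-level transactions: every transaction that contains an item also contains that item's group-level item. The sketch below illustrates this on hypothetical toy data; the postfix `"_grp"` and the `category` column are arbitrary choices for the example.

```r
library(arules)

## Hypothetical toy baskets and item groups.
baskets <- list(
  c("bread", "bagel", "milk"),
  c("milk", "butter"),
  c("bagel")
)
trans <- as(baskets, "transactions")
category <- c(bread = "baked goods", bagel = "baked goods",
              milk = "dairy", butter = "dairy")
itemInfo(trans)$category <- factor(unname(category[itemLabels(trans)]))

## Add group-level items next to the original items, marked with a postfix.
trans_ml <- addAggregate(trans, "category", postfix = "_grp")
itemLabels(trans_ml)

## Which transactions contain the item, and which contain its group item?
has_bagel <- trans_ml %in% "bagel"
has_group <- trans_ml %in% "baked goods_grp"

## Every transaction with "bagel" also has its group item, which is why
## a rule {bagel} => {baked goods_grp} would have confidence 1.
all(has_group[has_bagel])
```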
aggregate() returns an object of the same class as x
encoded with a number of items equal to the number of unique values in
by. Note that for associations (itemsets and rules) the number of
associations in the returned set will most likely be reduced since several
associations might map to the same aggregated association and aggregate
returns a unique set. If several associations map to a single aggregated
association then the quality measures of one of the original associations are
chosen at random.
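A quick way to check the stated item count is to compare the aggregated object against the number of distinct group labels. This is a small sketch using the Groceries data and its `level2` column, both of which appear in the examples on this page.

```r
library(arules)
data("Groceries")

## Aggregate the transactions to the level2 groups.
Groceries_level2 <- aggregate(Groceries, by = "level2")

## The result is still a transactions object ...
is(Groceries_level2, "transactions")

## ... and it has one item per distinct level2 value.
n_groups <- length(unique(itemInfo(Groceries)$level2))
nitems(Groceries_level2) == n_groups
```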
addAggregate() returns a new transactions object with the original
items and the group-items added. filterAggregate() returns a new
rules or itemsets object with the spurious rules or itemsets removed.
Michael Hahsler
Other preprocessing: discretize(), itemCoding, merge(), sample()
Other itemMatrix and transactions functions: abbreviate(), c(), crossTable(), duplicated(), extract, image(), inspect(), is.superset(), itemFrequency(), itemFrequencyPlot(), itemMatrix-class, match(), merge(), random.transactions(), sample(), sets, size(), supportingTransactions(), tidLists-class, transactions-class, unique()
data("Groceries")
Groceries
## Groceries contains a hierarchy stored in itemInfo
head(itemInfo(Groceries))
## Example 1: Aggregate items using an existing hierarchy stored in itemInfo.
## We aggregate to level2 stored in Groceries. All items with the same level2 label
## will become a single item with that name.
## Note that the number of items is therefore reduced to 55
Groceries_level2 <- aggregate(Groceries, by = "level2")
Groceries_level2
head(itemInfo(Groceries_level2)) ## labels are alphabetically sorted!
## compare original and aggregated transactions
inspect(head(Groceries, 2))
inspect(head(Groceries_level2, 2))
## Example 2: Aggregate using a character vector.
## We create here labels manually to organize items by their first letter.
mylevels <- toupper(substr(itemLabels(Groceries), 1, 1))
head(mylevels)
Groceries_alpha <- aggregate(Groceries, by = mylevels)
Groceries_alpha
inspect(head(Groceries_alpha, 2))
## Example 3: Aggregate rules
## Note: You could also directly mine rules from aggregated transactions to
## get support, confidence and lift.
rules <- apriori(Groceries, parameter = list(supp = 0.005, conf = 0.5))
rules
inspect(rules[1])
rules_level2 <- aggregate(rules, by = "level2")
inspect(rules_level2[1])
## Example 4: Mine multi-level rules.
## (1) Add aggregate items. These items will have labels ending with a *
Groceries_multilevel <- addAggregate(Groceries, "level2")
summary(Groceries_multilevel)
inspect(head(Groceries_multilevel))
rules <- apriori(Groceries_multilevel,
  parameter = list(support = 0.01, conf = .9))
inspect(head(rules, by = "lift"))
## Note that this contains many spurious rules of type 'item X => aggregate of item X'
## with a confidence of 1 and high lift. We can filter spurious rules resulting from
## the aggregation
rules <- filterAggregate(rules)
inspect(head(rules, by = "lift"))