itemCoding | R Documentation |
The order in which items are stored in an itemMatrix is called the item coding. The following generic functions and methods are used to translate between the representation in the itemMatrix format (used in transactions, rules and itemsets), item labels and numeric item IDs (i.e., the column numbers in the itemMatrix representation).
decode(x, ...)
## S4 method for signature 'numeric'
decode(x, itemLabels)
## S4 method for signature 'list'
decode(x, itemLabels)
encode(x, ...)
## S4 method for signature 'character'
encode(x, itemLabels, itemMatrix = TRUE)
## S4 method for signature 'numeric'
encode(x, itemLabels, itemMatrix = TRUE)
## S4 method for signature 'list'
encode(x, itemLabels, itemMatrix = TRUE)
recode(x, ...)
## S4 method for signature 'itemMatrix'
recode(x, itemLabels = NULL, match = NULL)
## S4 method for signature 'itemsets'
recode(x, itemLabels = NULL, match = NULL)
## S4 method for signature 'rules'
recode(x, itemLabels = NULL, match = NULL)
compatible(x, y)
## S4 method for signature 'itemMatrix'
compatible(x, y)
## S4 method for signature 'associations'
compatible(x, y)
x |
a vector or a list of vectors of character strings (for
|
... |
further arguments. |
itemLabels |
a vector of character strings used for coding where the position of an item label in the vector gives the item's column ID. Alternatively, an itemMatrix, transactions or associations object can be specified and the item labels of these objects are used. |
itemMatrix |
return an object of class itemMatrix; otherwise, an
object of the same class as x |
match |
deprecated: used |
y |
an object of class itemMatrix, transactions or
associations to compare item coding to |
Item coding compatibility: When working with several datasets or different
subsets of the same dataset, combining or comparing the found
itemsets or rules requires a compatible item coding.
That is, the sparse matrices representing the
items (the itemMatrix objects) have columns for the same items in exactly the
same order. The coercion to transactions with transactions() or
as(x, "transactions") will create the
item coding by adding items in the order they are encountered in the dataset. This
can lead to different item codings (different order, missing items) even for
only slightly different datasets or versions of a dataset.
The method compatible() can be used to check if two sets have the same item coding.
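To make the compatibility condition concrete, here is a minimal base R sketch. It does not use or reproduce the arules implementation. It assumes that a coding can be represented as a plain character vector of labels, and it mimics the "order encountered" behavior described above with unique(unlist()). The datasets and the names data1, data2, coding1, coding2 and same_coding are hypothetical.

```r
# Two small, hypothetical datasets given as lists of item labels.
data1 <- list(c("a", "b"), c("c"))
data2 <- list(c("b", "a"), c("c", "d"))

# Derive a coding for each dataset by collecting the labels
# in the order of their first appearance.
coding1 <- unique(unlist(data1))
coding2 <- unique(unlist(data2))

# Two codings are compatible only if they contain the same items
# in exactly the same order.
same_coding <- identical(coding1, coding2)
same_coding
```

Here the two codings differ in both order ("a" and "b" are swapped) and content ("d" is missing from the first), so the check fails even though the datasets are very similar.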
Defining a common item coding:
When working with many sets, first define a common item
coding by creating a vector with all possible item labels, and then
specify this vector as itemLabels when creating transactions with transactions().
Compatible itemMatrix objects can be created using encode().
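The idea of a common coding can be sketched in base R as follows. This is an illustration under stated assumptions, not the arules implementation: a dense logical matrix stands in for the sparse itemMatrix, and to_incidence() is a hypothetical helper introduced only for this example.

```r
# A common coding: all possible item labels in a fixed order.
common_labels <- c("a", "b", "c", "d")

data1 <- list(c("a", "b"), c("c"))
data2 <- list(c("b", "a"), c("c", "d"))

# Hypothetical helper: build a transactions-by-items logical matrix
# whose columns follow the order of `labels`.
to_incidence <- function(data, labels) {
  # Refuse labels that are not part of the coding.
  stopifnot(all(unlist(data) %in% labels))
  m <- t(vapply(data, function(items) labels %in% items,
                logical(length(labels))))
  colnames(m) <- labels
  m
}

m1 <- to_incidence(data1, common_labels)
m2 <- to_incidence(data2, common_labels)

# Both matrices now have columns for the same items in the same order.
identical(colnames(m1), colnames(m2))
```

Because both matrices are built against the same label vector, their columns line up regardless of the order in which items appear in each dataset.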
Recoding and Decoding:
Two incompatible objects can be made compatible using recode(). Recode
one object by specifying the other object in itemLabels.
decode() converts from the column IDs used in the itemMatrix
representation to item labels. decode() is used by LIST().
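Conceptually, decoding is a lookup from column IDs to labels, and recoding maps IDs from one coding to another by going through the labels. The following base R sketch illustrates this under the assumption that a coding is a plain character vector. It is not how arules implements decode() or recode(), and all names in it are hypothetical.

```r
old_labels <- c("b", "a", "c")        # coding of the object to recode
new_labels <- c("a", "b", "c", "d")   # target coding

# Item IDs (column numbers) under the old coding.
ids <- list(c(1L, 2L), 3L)

# Decode: replace each ID with the label at that position.
decoded <- lapply(ids, function(i) old_labels[i])

# Recode: look up each label's position in the target coding.
recoded <- lapply(decoded, function(l) match(l, new_labels))

# Decoding the recoded IDs with the new coding gives the same labels.
roundtrip <- lapply(recoded, function(i) new_labels[i])
identical(decoded, roundtrip)
```

The IDs change (item "b" moves from column 1 to column 2), but the labels they stand for do not.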
recode() always returns an object of the same class as x.
For encode() with itemMatrix = TRUE, an object of class
itemMatrix is returned. Otherwise, the result is of the same type as
x, e.g., a list or a vector.
Michael Hahsler
LIST(), associations, itemMatrix
Other preprocessing: discretize(), hierarchy, merge(), sample()
library("arules")
data("Adult")
## Example 1: Manual decoding
## Extract the item coding as a vector of item labels.
iLabels <- itemLabels(Adult)
head(iLabels)
## get undecoded list (item IDs)
## (named idList to avoid masking base::list)
idList <- LIST(Adult[1:5], decode = FALSE)
idList
## decode item IDs by replacing them with the appropriate item labels
decode(idList, itemLabels = iLabels)
## Example 2: Manually create an itemMatrix using iLabels as the common item coding
data <- list(
  c("income=small", "age=Young"),
  c("income=large", "age=Middle-aged")
)
# Option a: encode to match the item coding in Adult
iM <- encode(data, itemLabels = Adult)
iM
inspect(iM)
compatible(iM, Adult)
# Option b: coercion plus recode to make it compatible with Adult
# (note: the coding has 115 item columns after recode)
iM <- as(data, "itemMatrix")
iM
compatible(iM, Adult)
iM <- recode(iM, itemLabels = Adult)
iM
compatible(iM, Adult)
## Example 3: use recode to make itemMatrices compatible
## select first 100 transactions and all education-related items
sub <- Adult[1:100, itemInfo(Adult)$variables == "education"]
itemLabels(sub)
image(sub)
## After choosing only a subset of items (columns), the item coding is now
## no longer compatible with the Adult dataset
compatible(sub, Adult)
## recode to match Adult again
sub.recoded <- recode(sub, itemLabels = Adult)
image(sub.recoded)
## Example 4: manually create two new transactions for the Adult data set
## Note: check itemLabels(Adult) to see the available labels for items
twoTransactions <- as(
  encode(list(
    c("age=Young", "relationship=Unmarried"),
    c("age=Senior")
  ), itemLabels = Adult),
  "transactions")
twoTransactions
inspect(twoTransactions)
## the same using the transactions constructor function instead
twoTransactions <- transactions(
  list(
    c("age=Young", "relationship=Unmarried"),
    c("age=Senior")
  ), itemLabels = Adult)
twoTransactions
inspect(twoTransactions)
## Example 5: Use a common item coding
# Creation of transactions separately will produce different item codings
trans1 <- transactions(
  list(
    c("age=Young", "relationship=Unmarried"),
    c("age=Senior")
  ))
trans1
trans2 <- transactions(
  list(
    c("age=Middle-aged", "relationship=Married"),
    c("relationship=Unmarried", "age=Young")
  ))
trans2
compatible(trans1, trans2)
# produce common item coding (all item labels in the two sets)
commonItemLabels <- union(itemLabels(trans1), itemLabels(trans2))
commonItemLabels
trans1 <- recode(trans1, itemLabels = commonItemLabels)
trans1
trans2 <- recode(trans2, itemLabels = commonItemLabels)
trans2
compatible(trans1, trans2)
## Example 6: manually create a rule using the item coding in Adult
## and calculate interest measures
aRule <- new("rules",
  lhs = encode(list(c("age=Young", "relationship=Unmarried")),
    itemLabels = Adult),
  rhs = encode(list(c("income=small")),
    itemLabels = Adult)
)
## shorter version using the rules constructor
aRule <- rules(
  lhs = list(c("age=Young", "relationship=Unmarried")),
  rhs = list(c("income=small")),
  itemLabels = Adult
)
quality(aRule) <- interestMeasure(aRule,
  measure = c("support", "confidence", "lift"), transactions = Adult)
inspect(aRule)