transactions-class | R Documentation |
The transactions class is a subclass of itemMatrix and
represents transaction data used for mining associations.
transactions(
x,
itemLabels = NULL,
transactionInfo = NULL,
format = "wide",
cols = NULL
)
## S4 method for signature 'transactions'
summary(object)
## S4 method for signature 'transactions'
toLongFormat(from, cols = c("TID", "item"), decode = TRUE)
## S4 method for signature 'transactions'
items(x)
transactionInfo(x)
## S4 method for signature 'transactions'
transactionInfo(x)
transactionInfo(x) <- value
## S4 replacement method for signature 'transactions'
transactionInfo(x) <- value
## S4 method for signature 'transactions'
dimnames(x)
## S4 replacement method for signature 'transactions,list'
dimnames(x) <- value
x, object, from |
the object |
itemLabels |
a vector with labels for the items |
transactionInfo |
a transaction information data.frame with one row per transaction. |
format |
the data format of x, either "wide" (default) or "long"; see Examples 3 and 4. |
cols |
a numeric or character vector of length two giving the index or names of the columns (fields) with the transaction and item ids in the long format. |
decode |
translate item IDs to item labels? |
value |
replacement value |
Transactions store the presence of items in each individual transaction
as a binary matrix where rows represent the transactions and columns represent the items.
transactions directly extends class itemMatrix
to store the sparse binary incidence matrix, item labels, and optionally transaction
IDs and user IDs. If you work with several transaction sets at the
same time, then the encoding (order of the items in the binary matrix) in
the different sets is important. See itemCoding to learn how
to encode and recode transaction sets.
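As a minimal sketch of why the encoding matters, the following builds two small transaction sets over a shared item universe by passing the same itemLabels to the constructor. The object names (list1, list2, all_items, t1, t2) and the toy data are assumptions made for illustration; itemCoding describes the functions intended for recoding existing sets.

```r
library(arules)

# Hypothetical toy data: two transaction sets that use different items
list1 <- list(c("a", "b"), c("b", "c"))
list2 <- list(c("c", "d"), c("a"))

# A shared item universe so that both sets use the same columns
all_items <- sort(unique(c(unlist(list1), unlist(list2))))

t1 <- transactions(list1, itemLabels = all_items)
t2 <- transactions(list2, itemLabels = all_items)

# Both sets should now have the same items in the same order
identical(itemLabels(t1), itemLabels(t2))
```

Without the shared itemLabels, each set would only contain columns for the items that actually occur in it, so column positions in the two sets would not line up.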
Data Preparation
Data typically starts as a data.frame or a matrix and needs to be
prepared before it can be converted into transactions
(see coercion methods in
the Methods Section and the Example Section below for details on the needed
format).
Columns need to represent items. How a column is translated into items depends on its data type:
Continuous variables: Continuous variables cannot directly be represented as
items and need to be discretized first. An item resulting from discretization might be
age>18, with the column containing only TRUE or FALSE.
Alternatively, the column can be a factor with the levels age<=18, 50>=age>18 and age>50.
These will be automatically converted into 3 items, one for each level.
Discretization is described in functions discretize() and discretizeDF().
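The discretization step can be sketched as follows. The data frame people and the chosen break points are assumptions for illustration only; discretize() with method = "fixed" is used here to cut the numeric column at user-supplied breaks.

```r
library(arules)

# Hypothetical data with one continuous variable
people <- data.frame(age = c(12, 25, 47, 63, 18))

# Replace the numeric column by a factor with three interval levels
people$age <- discretize(people$age, method = "fixed",
                         breaks = c(0, 18, 50, Inf))
levels(people$age)

# Each factor level becomes one item
trans <- transactions(people)
inspect(trans)
```

Every transaction here contains exactly one age item, because each person falls into exactly one interval.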
Logical variables: A logical variable describing a person could be tall,
indicating if the person is tall using the values TRUE and FALSE.
The fact that the person is tall would be encoded in the
transaction containing the item tall while not tall persons would not
have this item. Therefore, for logical variables, the TRUE value is
converted into an item with the name of the variable and for the
FALSE values no item is created.
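A small sketch of this behavior, using a made-up data frame persons with two logical columns (both the name and the values are assumptions for illustration):

```r
library(arules)

# Hypothetical data: two logical variables describing three persons
persons <- data.frame(
  tall   = c(TRUE, FALSE, TRUE),
  smoker = c(TRUE, FALSE, FALSE)
)

trans <- transactions(persons)

itemLabels(trans)  # items are named after the variables
size(trans)        # number of items in each transaction
inspect(trans)
```

The second person has FALSE for both variables and therefore ends up as a transaction without any items.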
Factors: Columns with nominal values
(i.e., factor, ordered) are translated into a series of binary items (one for each level,
constructed as variable name = level). Items cannot represent order, and thus ordered factors
lose the order information. Note that nominal variables
need to be encoded as factors (and not characters or numbers). This can be
done with data[,"a_nominal_var"] <- factor(data[,"a_nominal_var"]).
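As a sketch of the factor conversion, the following encodes all columns of a hypothetical data frame products as factors before creating transactions. The data frame and its column names are assumptions for illustration.

```r
library(arules)

# Hypothetical data: nominal values stored as character strings
products <- data.frame(
  color = c("red", "blue", "red"),
  shape = c("round", "square", "round"),
  stringsAsFactors = FALSE
)

# Encode every nominal column as a factor before the conversion
products[] <- lapply(products, factor)

trans <- transactions(products)
itemLabels(trans)  # labels have the form "variable=level"
inspect(trans)
```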
Complete examples for how to prepare data can be found in the man pages for Income and Adult.
summary(transactions): produce a summary
toLongFormat(transactions): convert the transactions to long format
(a data.frame with two columns, TID and item). Column names can
be specified as a character vector of length 2 called cols.
items(transactions): get the transactions as an itemMatrix
transactionInfo(transactions): get the transaction info data.frame
transactionInfo(transactions) <- value: replace the transaction info data.frame
dimnames(transactions): get the dimnames
dimnames(x = transactions) <- value: set the dimnames
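The accessors listed above can be tried on a small object. This is only a sketch; the toy transactions and the added store column are assumptions made for illustration.

```r
library(arules)

# Hypothetical toy transactions with transaction names
trans <- transactions(list(T1 = c("a", "b"), T2 = c("b", "c")))

summary(trans)

# Long format: one row per (transaction, item) pair
long <- toLongFormat(trans)
long

# The items without the transaction information
class(items(trans))

# Read, extend, and write back the transaction info data.frame
info <- transactionInfo(trans)
info$store <- c("north", "south")
transactionInfo(trans) <- info
transactionInfo(trans)

# dimnames: transaction IDs first, item labels second
dimnames(trans)
```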
Slots are inherited from itemMatrix.
Objects are created by:
coercion from objects of other classes. itemLabels and transactionInfo are
by default created from information in x (e.g., from row and column names).
the constructor function transactions().
by calling new("transactions", ...).
See Examples Section for creating transactions from data.
as("transactions", "matrix")
as("matrix", "transactions")
as("list", "transactions")
as("transactions", "list")
as("data.frame", "transactions")
as("transactions", "data.frame")
as("ngCMatrix", "transactions")
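The entries above appear to name the source class first and the target class second. When calling as() directly, the object comes first and the name of the target class second. The following sketch, using made-up toy transactions, shows a few of these coercions:

```r
library(arules)

# Hypothetical toy transactions
trans <- transactions(list(T1 = c("a", "b"), T2 = c("b", "c")))

m <- as(trans, "matrix")     # transactions in rows, items in columns
l <- as(trans, "list")       # one character vector per transaction
s <- as(trans, "ngCMatrix")  # sparse matrix class from package Matrix

dim(m)
l
dim(s)  # compare with dim(m) to see how the sparse matrix is oriented

# Coercion in the other direction
trans_back <- as(l, "transactions")
inspect(trans_back)
```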
Michael Hahsler
Superclass: itemMatrix
Other itemMatrix and transactions functions:
abbreviate(), c(), crossTable(), duplicated(), extract, hierarchy, image(), inspect(),
is.superset(), itemFrequency(), itemFrequencyPlot(), itemMatrix-class, match(), merge(),
random.transactions(), sample(), sets, size(), supportingTransactions(), tidLists-class,
unique()
## Example 1: creating transactions from a list (each element is a transaction)
a_list <- list(
c("a","b","c"),
c("a","b"),
c("a","b","d"),
c("c","e"),
c("a","b","d","e")
)
## Set transaction names
names(a_list) <- paste("Tr", c(1:5), sep = "")
a_list
## Use the constructor to create transactions
## Note: S4 coercion does the same: trans1 <- as(a_list, "transactions")
trans1 <- transactions(a_list)
trans1
## Analyze the transactions
summary(trans1)
image(trans1)
## Example 2: creating transactions from a 0-1 matrix with 5 transactions (rows) and
## 5 items (columns)
## byrow = TRUE so that each line below is one transaction (row)
a_matrix <- matrix(
c(1, 1, 1, 0, 0,
1, 1, 0, 0, 0,
1, 1, 0, 1, 0,
0, 0, 1, 0, 1,
1, 1, 0, 1, 1), ncol = 5, byrow = TRUE)
## Set item names (columns) and transaction labels (rows)
colnames(a_matrix) <- c("a", "b", "c", "d", "e")
rownames(a_matrix) <- paste("Tr", c(1:5), sep = "")
a_matrix
## Create transactions
trans2 <- transactions(a_matrix)
trans2
inspect(trans2)
## Example 3: creating transactions from data.frame (wide format)
a_df <- data.frame(
age = as.factor(c( 6, 8, NA, 9, 16)),
grade = as.factor(c("A", "C", "F", NA, "C")),
pass = c(TRUE, TRUE, FALSE, TRUE, TRUE))
## Note: factors are translated differently than logicals and NAs are ignored
a_df
## Create transactions
trans3 <- transactions(a_df)
inspect(trans3)
## Note that coercing the transactions back to a data.frame does not recreate the
## original data.frame, but represents the transactions as sets of items
as(trans3, "data.frame")
## Example 4: creating transactions from a data.frame with
## transaction IDs and items (long format)
a_df3 <- data.frame(
TID = c( 1, 1, 2, 2, 2, 3 ),
item = c("a", "b", "a", "b", "c", "b")
)
a_df3
trans4 <- transactions(a_df3, format = "long", cols = c("TID", "item"))
trans4
inspect(trans4)
## convert transactions back into long format.
toLongFormat(trans4)
## Example 5: create transactions from a dataset with numeric variables
## using discretization.
data(iris)
irisDisc <- discretizeDF(iris)
head(irisDisc)
trans5 <- transactions(irisDisc)
trans5
inspect(head(trans5))
## Note, creating transactions without discretizing numeric variables will apply the
## default discretization and also create a warning.
## Example 6: create transactions manually (with the same item coding as in trans5)
trans6 <- transactions(
list(
c("Sepal.Length=[4.3,5.4)", "Species=setosa"),
c("Sepal.Length=[4.3,5.4)", "Species=setosa")
), itemLabels = trans5)
trans6
inspect(trans6)