itemMatrix-class: Class itemMatrix - Sparse Binary Incidence Matrix to...

Description

The itemMatrix class is the basic building block for transactions, itemsets and rules in package arules. The class contains a sparse Matrix representation of items (a set of itemsets or transactions) and the corresponding item labels.

Details

Sets of itemsets are represented as sparse binary matrices. If you work with several itemMatrix objects at the same time (e.g., several transaction sets, or the lhs and rhs of a rule), then the encoding (the itemLabels and the order of the items in the binary matrix) must be the same in all of them. See itemCoding to learn how to encode and recode itemMatrix objects.
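
As a minimal sketch of why the encoding matters: two itemMatrix objects built independently from lists end up with different item labels and column positions, and recoding both to one shared label vector makes them conform. The item names are made up for illustration, and recode() is assumed to behave as described on the itemCoding help page.

```r
library(arules)

## Two itemMatrix objects created independently; the item names are
## illustrative only.
a <- as(list(c("bread", "milk"), c("milk")), "itemMatrix")
b <- as(list(c("beer", "milk"), c("beer")), "itemMatrix")

## The encodings differ: different labels and column positions.
itemLabels(a)
itemLabels(b)

## Recode both objects to one shared set of item labels.
shared <- union(itemLabels(a), itemLabels(b))
a2 <- recode(a, itemLabels = shared)
b2 <- recode(b, itemLabels = shared)

## Both objects now use the same labels in the same order.
identical(itemLabels(a2), itemLabels(b2))
```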

Objects from the Class

Objects can be created by calls of the form new("itemMatrix", ...). However, most of the time objects will be created by coercion from a matrix, list or data.frame.
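
The following sketch shows the two coercion routes documented under Methods (from a list and from a binary matrix). The item names and the small matrix are invented for illustration.

```r
library(arules)

## From a list: each vector is one itemset.
from_list <- as(list(c("a", "b"), c("b", "c"), "c"), "itemMatrix")

## From a binary matrix: rows are itemsets, columns are items.
m <- matrix(c(1, 1, 0,
              0, 1, 1,
              0, 0, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))
from_matrix <- as(m, "itemMatrix")

## Both routes describe the same three itemsets.
as(from_list, "list")
as(from_matrix, "list")
dim(from_list)
dim(from_matrix)
```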

Slots

data:

Object of class ngCMatrix (from package Matrix) which stores item occurrences in sparse representation. Note that the ngCMatrix is column-oriented and itemMatrix is row-oriented with each row representing an element (an itemset, a transaction, etc.). As a result, the ngCMatrix in this slot is always a transposed version of the binary incidence matrix in itemMatrix.

itemInfo:

a data.frame which contains named vectors of length equal to the number of items (columns) in the itemMatrix. If the slot is not empty (i.e., it contains item labels), the first element in the data.frame must have the name "labels" and contain a character vector with the item labels used for representing an item. In addition to the item labels, the data.frame can contain arbitrary named vectors (of the same length) to represent, e.g., variable names and values which were used to create the binary items or hierarchical category information associated with each item label.

itemsetInfo:

a data.frame which may contain additional information for the rows (mostly representing itemsets) in the matrix.
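
A short sketch of how the three slots relate to each other, using a made-up itemMatrix. The extra itemInfo column "group" is a hypothetical annotation and not something arules defines.

```r
library(arules)

x <- as(list(c("a", "b"), c("b", "c", "d")), "itemMatrix")

## itemMatrix view: rows are itemsets, columns are items.
dim(x)

## The data slot stores the transpose: items as rows, itemsets as columns.
dim(x@data)

## itemInfo holds one row per item; "labels" comes first. The column
## "group" is an illustrative annotation only.
itemInfo(x) <- data.frame(
  labels = itemLabels(x),
  group  = c("g1", "g1", "g2", "g2"),
  stringsAsFactors = FALSE
)
itemInfo(x)

## itemsetInfo holds optional information with one row per itemset.
itemsetInfo(x)
```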

Methods

coerce

signature(from = "matrix", to = "itemMatrix"); expects from to be a binary matrix only containing 0s and 1s.

coerce

signature(from = "itemMatrix", to = "matrix"); coerces to a dense 0-1 matrix of storage.mode "integer" instead of "double" to save memory.

coerce

signature(from = "list", to = "itemMatrix"); from is a list of vectors. Each vector contains one set/transaction/....

coerce

signature(from = "itemMatrix", to = "list"); see also the methods for LIST.

coerce

signature(from = "itemMatrix", to = "ngCMatrix"); access the sparse matrix representation. Note that the ngCMatrix contains a transposed form of the itemMatrix.

coerce

signature(from = "ngCMatrix", to = "itemMatrix"); note that the ngCMatrix has to be in transposed form, with items as rows.

c

signature(object = "itemMatrix"); combine.

dim

signature(x = "itemMatrix"); returns the dimensions of the itemMatrix.

dimnames, rownames, colnames

signature(x = "itemMatrix"); returns row (itemsetID) and column (item) names.

dimnames

signature(x = "itemMatrix"); returns dimnames.

dimnames<-

signature(x = "itemMatrix", value = "list"); replace dimnames.

%in%

signature(x = "itemMatrix", table = "character"); matches the strings in table against the item labels in x and returns a logical vector indicating if a row (itemset) in x contains any of the items specified in table. Note that there is also a %in% method with signature(x = "itemMatrix", table = "itemMatrix"). That method is described together with match.

%ain%

signature(x = "itemMatrix", table = "character"); matches the strings in table against the item labels in x and returns a logical vector indicating if a row (itemset) in x contains all of the items specified in table.

%oin%

signature(x = "itemMatrix", table = "character"); matches the strings in table against the item labels in x and returns a logical vector indicating if a row (itemset) in x contains only items specified in table.

%pin%

signature(x = "itemMatrix", table = "character"); matches the strings in table against the item labels in x (using partial matching) and returns a logical vector indicating if a row (itemset) in x contains any of the items specified in table.

itemLabels

signature(object = "itemMatrix"); returns the item labels used for encoding as a character vector.

itemLabels<-

signature(object = "itemMatrix"); replaces the item labels used for encoding.

itemInfo

signature(object = "itemMatrix"); returns the whole item/column information data.frame including labels.

itemInfo<-

signature(object = "itemMatrix"); replaces the item/column info by a data.frame.

itemsetInfo

signature(object = "itemMatrix"); returns the item set/row information data.frame.

itemsetInfo<-

signature(object = "itemMatrix"); replaces the item set/row info by a data.frame.

labels

signature(x = "transactions"); returns labels for the itemsets. The following arguments can be used to customize the representation of the labels: itemSep, setStart and setEnd.

nitems

signature(x = "itemMatrix"); returns the number of items (number of columns) in the itemMatrix.

show

signature(object = "itemMatrix")

summary

signature(object = "itemMatrix")
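
As a rough illustration of several of the methods listed above, the sketch below builds a tiny itemMatrix from invented item names and applies the size accessors, the four matching operators and c.

```r
library(arules)

x <- as(list(c("milk", "bread"), "bread", c("beer", "chips")), "itemMatrix")

## Size and labels.
dim(x)        # itemsets x items
nitems(x)
length(x)
itemLabels(x)

## Item matching; each call returns one logical value per itemset.
any_match  <- x %in%  c("milk", "beer")   # contains any of the items
all_match  <- x %ain% c("milk", "bread")  # contains all of the items
only_match <- x %oin% c("milk", "bread")  # contains only these items
part_match <- x %pin% "mil"               # partial match on item labels

## Combine two itemMatrix objects that share the same item coding.
y <- c(x[1:2], x[3])
length(y)
```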

Author(s)

Michael Hahsler

See Also

LIST, c, duplicated, inspect, is.subset, is.superset, itemFrequency, itemFrequencyPlot, itemCoding, match, length, sets, subset, unique, [-methods, image, ngCMatrix-class (from Matrix), transactions-class, itemsets-class, rules-class

Examples

library(arules)

set.seed(1234)
  
## Generate random data and coerce data to itemMatrix.
m <- matrix(runif(100000)>0.8, ncol=20)
dimnames(m) <- list(NULL, paste("item", c(1:20), sep=""))
i <- as(m, "itemMatrix")

## Get the number of elements (rows) in the itemMatrix.
length(i)

## Get first 5 elements (rows) of the itemMatrix as list.
as(i[1:5], "list")

## Get first 5 elements (rows) of the itemMatrix as matrix.
as(i[1:5], "matrix")

## Get first 5 elements (rows) of the itemMatrix as sparse ngCMatrix.
## Warning: for efficiency reasons, the ngCMatrix you get is transposed!
as(i[1:5], "ngCMatrix")

## Get labels for the first 5 itemsets (first default and then with 
## custom formatting)
labels(i[1:5])
labels(i[1:5], itemSep = " + ", setStart = "", setEnd = "")

## create itemsets from itemMatrix  
is <- new("itemsets", items = i[1:3])
inspect(is)

## create rules (rhs and lhs cannot share items so I use 
## itemSetdiff here). Also assign (random) support.
rules <- new("rules", lhs=itemSetdiff(i[4:6],i[1:3]), rhs=i[1:3],
  quality = data.frame(support = runif(3)))
inspect(rules) 

Example output

Loading required package: Matrix

Attaching package: 'arules'

The following objects are masked from 'package:base':

    abbreviate, write

[1] 5000
$`1`
[1] "item7"  "item10" "item13" "item15" "item18" "item20"

$`2`
[1] "item5"  "item7"  "item17"

$`3`
[1] "item15"

$`4`
[1] "item3" "item8"

$`5`
[1] "item1"  "item8"  "item20"

  item1 item2 item3 item4 item5 item6 item7 item8 item9 item10 item11 item12
1 FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE   TRUE  FALSE  FALSE
2 FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE  FALSE  FALSE  FALSE
3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  FALSE  FALSE  FALSE
4 FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE  FALSE  FALSE  FALSE
5  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  FALSE  FALSE  FALSE
  item13 item14 item15 item16 item17 item18 item19 item20
1   TRUE  FALSE   TRUE  FALSE  FALSE   TRUE  FALSE   TRUE
2  FALSE  FALSE  FALSE  FALSE   TRUE  FALSE  FALSE  FALSE
3  FALSE  FALSE   TRUE  FALSE  FALSE  FALSE  FALSE  FALSE
4  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE
5  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE   TRUE
20 x 5 sparse Matrix of class "ngCMatrix"
       1 2 3 4 5
item1  . . . . |
item2  . . . . .
item3  . . . | .
item4  . . . . .
item5  . | . . .
item6  . . . . .
item7  | | . . .
item8  . . . | |
item9  . . . . .
item10 | . . . .
item11 . . . . .
item12 . . . . .
item13 | . . . .
item14 . . . . .
item15 | . | . .
item16 . . . . .
item17 . | . . .
item18 | . . . .
item19 . . . . .
item20 | . . . |
[1] "{item7,item10,item13,item15,item18,item20}"
[2] "{item5,item7,item17}"                      
[3] "{item15}"                                  
[4] "{item3,item8}"                             
[5] "{item1,item8,item20}"                      
[1] "item7 + item10 + item13 + item15 + item18 + item20"
[2] "item5 + item7 + item17"                            
[3] "item15"                                            
[4] "item3 + item8"                                     
[5] "item1 + item8 + item20"                            
    items                                     
[1] {item7,item10,item13,item15,item18,item20}
[2] {item5,item7,item17}                      
[3] {item15}                                  
    lhs         rhs        support
[1] {item3,                       
     item8}  => {item7,           
                 item10,          
                 item13,          
                 item15,          
                 item18,          
                 item20} 0.6881455
[2] {item1,                       
     item8,                       
     item20} => {item5,           
                 item7,           
                 item17} 0.2000279
[3] {item9,                       
     item17,                      
     item18} => {item15} 0.5099691

arules documentation built on Nov. 17, 2017, 6:02 a.m.