XSim2008: Bi-clustering using Co-Similarity (2008 version)

Description

The χ-Sim algorithm is a co-similarity based approach that builds two similarity matrices simultaneously: SR (between documents) and SC (between words), each computed from the other. This function implements the 2008 version of χ-Sim by Gilles Bisson and Fawad Hussain; see References.
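
The mutual update described above can be sketched in a few lines of R. This is an illustration under stated assumptions, not the code that XSim2008 runs: the helper name `xsim_sketch` is hypothetical, both matrices are assumed to start as identity matrices, and each update is assumed to be normalized by the products of row sums (for SR) or column sums (for SC), with the diagonal reset to 1. The input is assumed to be non-negative with no all-zero row or column.

```r
# Hypothetical helper, not part of PDI2015: a sketch of a co-similarity
# iteration in which SR and SC are each rebuilt from the other.
xsim_sketch <- function(M, itr = 4) {
  M  <- as.matrix(M)
  SR <- diag(nrow(M))            # row (document) similarities, identity at start
  SC <- diag(ncol(M))            # column (word) similarities, identity at start
  NR <- tcrossprod(rowSums(M))   # NR[i, j] = sum(M[i, ]) * sum(M[j, ])
  NC <- tcrossprod(colSums(M))   # NC[i, j] = sum(M[, i]) * sum(M[, j])
  for (t in seq_len(itr)) {
    # Assumption: both updates use the matrices from the previous iteration.
    SR_new <- (M %*% SC %*% t(M)) / NR
    SC_new <- (t(M) %*% SR %*% M) / NC
    diag(SR_new) <- 1            # every row is fully similar to itself
    diag(SC_new) <- 1
    SR <- SR_new
    SC <- SC_new
  }
  list(SR = SR, SC = SC)
}

# Toy data: rows 1 and 2 share words 1-2, row 3 uses words 3-4 only.
M   <- matrix(c(1, 1, 0, 0,
                1, 1, 0, 0,
                0, 0, 1, 1), nrow = 3, byrow = TRUE)
res <- xsim_sketch(M, itr = 4)

# One way to turn the row similarities into clusters (an assumption here):
# hierarchical clustering on the dissimilarity 1 - SR.
hc     <- hclust(as.dist(1 - res$SR), method = "ward.D2")
groups <- cutree(hc, k = 2)
```

In this toy matrix the first two rows end up more similar to each other than to the third, so cutting the tree at two groups separates row 3 from rows 1 and 2.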

Usage

XSim2008(x, y, itr = 4)

Arguments

x

matrix or data frame of predictors, of dimension n × p; each row is an observation vector.

y

response variable: the class label of each observation, coded 1 or 2.

itr

number of iterations (default 4).
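
A small sketch of preparing inputs in the shape the arguments describe. The data and names below (`x_df`, `y_raw`) are made up for illustration, and the conversion assumes that a two-level factor should become the numeric codes 1 and 2, as the `as.numeric(gordon$y)` call in the Examples suggests.

```r
# Illustrative data only; the values and labels are invented.
x_df  <- data.frame(g1 = c(5, 7, 1), g2 = c(2, 3, 9))
y_raw <- factor(c("normal", "normal", "tumor"))

x <- as.matrix(x_df)     # n x p numeric matrix, one observation per row
y <- as.numeric(y_raw)   # factor levels become the codes 1 and 2

# Basic sanity checks before calling a function such as XSim2008(x, y)
stopifnot(nrow(x) == length(y), all(y %in% c(1, 2)))
```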

Value

A list of objects. The examples below read its accuracy element (for example, res$accuracy).

References

Bisson G, Hussain F. χ-Sim: A new similarity measure for the co-clustering task. 7th IEEE International Conference on Machine Learning and Applications (ICMLA), 11-13 Dec. 2008, San Diego, United States.

Examples

library(PDI2015)

# Example 1: Colon Cancer dataset (the first column holds the class label)
res = XSim2008(ColonCancer[, -1], ColonCancer[, 1], itr = 4)

# 20 Newsgroups subsets
res1 = XSim2008(NG20SMI$M2[[1]],  NG20SMI$M2[["class"]],  itr = 4)
res2 = XSim2008(NG20SMI$M2[[2]],  NG20SMI$M2[["class"]],  itr = 4)
res3 = XSim2008(NG20SMI$M5[[1]],  NG20SMI$M5[["class"]],  itr = 4)
res4 = XSim2008(NG20SMI$M10[[1]], NG20SMI$M10[["class"]], itr = 4)
res5 = XSim2008(NG20SMI$NG1[[1]], NG20SMI$NG1[["class"]], itr = 4)
res6 = XSim2008(NG20SMI$NG2[[1]], NG20SMI$NG2[["class"]], itr = 4)
res7 = XSim2008(NG20SMI$NG3[[1]], NG20SMI$NG3[["class"]], itr = 4)

# Example 2: Lung Cancer (Gordon et al. 2002)
tmp = gordon$x

# Flag low-variation columns: max/min ratio below 5, or range below 600
idx = c()
for (i in seq_len(ncol(tmp))) {
  if ((abs(max(tmp[, i]) / min(tmp[, i])) < 5) || (abs(max(tmp[, i]) - min(tmp[, i])) < 600)) {
    idx = c(idx, i)
  }
}

# Drop the flagged columns; tmp[, -idx] with an empty idx would drop every column
tmp2 = if (length(idx) > 0) tmp[, -idx, drop = FALSE] else tmp
dim(tmp2)

classs = as.numeric(gordon$y)

# Compare two algorithms over 1 to 4 iterations
# 1. XSIM 2008
tt1 = c()
for (i in 1:4) {
  fit = XSim2008(tmp2, classs, itr = i)
  tt1[i] = fit$accuracy
}

# 2. XSIM.mod of Hussain
tt4 = c()
for (i in 1:4) {
  fit = XSim.mod(tmp2, classs, itr = i)
  tt4[i] = fit$accuracy
}

# Comparison: best accuracy per algorithm
max(tt1)
max(tt4)

# Example 3: Colon Cancer (Alon et al. 1999)
tmp = ColonCancer[, -1]

# Flag low-variation columns: max/min ratio below 15, or range below 500
idx = c()
for (i in seq_len(ncol(tmp))) {
  if ((abs(max(tmp[, i]) / min(tmp[, i])) < 15) || (abs(max(tmp[, i]) - min(tmp[, i])) < 500)) {
    idx = c(idx, i)
  }
}

# Drop the flagged columns, guarding against an empty idx as above
tmp2 = if (length(idx) > 0) tmp[, -idx, drop = FALSE] else tmp
dim(tmp2)

# Compare four algorithms over 1 to 4 iterations
# 1. XSIM 2008
tt1 = c()
for (i in 1:4) {
  fit = XSim2008(tmp2, ColonCancer[, 1], itr = i)
  tt1[i] = fit$accuracy
}

# 2. XSIM 2010
tt2 = c()
for (i in 1:4) {
  fit = XSim2010(tmp2, ColonCancer[, 1], itr = i)
  tt2[i] = fit$accuracy
}

# 3. XSIM 2015
tt3 = c()
for (i in 1:4) {
  fit = XSim2015(tmp2, ColonCancer[, 1], itr = i)
  tt3[i] = fit$accuracy
}

# 4. XSIM.mod of Hussain
tt4 = c()
for (i in 1:4) {
  fit = XSim.mod(tmp2, ColonCancer[, 1], itr = i)
  tt4[i] = fit$accuracy
}

# Comparison: best accuracy per algorithm
max(tt1)
# max(tt2)
max(tt3)
max(tt4)

daosang/PDI2015 documentation built on May 14, 2019, 6:07 p.m.