Description

Performs multiple kernel K-means clustering on multi-view data.
Arguments
K: N x N x P array containing P kernel matrices, each of size N x N.

centers: The number of clusters, say k.

iter.max: The maximum number of iterations allowed. The default is 10.

epsilon: Convergence threshold. The default is 10^-4.

theta: Initial values for the kernel coefficients. The default is 1/P for all views.

x: Object of class inheriting from "MultipleKernelKmeans".

...: Additional arguments passed to or from other methods.
Details

The optimization problem is specified by an array of the multiple kernels (K), the number of clusters (centers), and the linear constraint on the kernel coefficients (θ).
Our method puts more weight on views carrying weak signal for the cluster information, so that important, complementary information collected from all the views is utilized. That is, the larger the unexplained variance in a view, the larger the kernel coefficient θ assigned to that view. After combining the multiple views, the method minimizes the unexplained variance in the combined view by optimizing continuous cluster assignments. Discrete cluster assignments are then recovered by performing K-means clustering on the (normalized) continuous cluster assignments. This K-means step is run with 1000 random starts, and the best result minimizing the objective function is reported.
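As a sketch of the discrete-recovery step described above, assuming H stands in for an N x k matrix of continuous cluster assignments (here filled with random placeholder values, not output of this package):

```r
# Sketch only: recover discrete clusters from continuous assignments.
# H is a placeholder for the continuous assignment matrix.
set.seed(1)
H <- matrix(abs(rnorm(30)), nrow = 10, ncol = 3)

# Row-normalize the continuous assignments.
H <- H / sqrt(rowSums(H^2))

# K-means with 1000 random starts, keeping the best solution.
fit <- kmeans(H, centers = 3, nstart = 1000)
fit$cluster   # discrete cluster labels, one per point
```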
We recommend standardizing all the original features to have zero mean and unit variance before applying kernel functions. The kernel matrices should be constructed before running the algorithm. We also recommend normalizing the kernels using StandardizeKernel, which makes the multiple views comparable to each other.
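The recommended preprocessing can be sketched as follows; the raw feature matrix x and the RBF bandwidth are illustrative placeholders:

```r
library(kernlab)

# Placeholder raw features: 20 observations, 5 variables.
set.seed(1)
x <- matrix(rnorm(100), nrow = 20)

# Standardize each feature to zero mean and unit variance.
x.std <- scale(x, center = TRUE, scale = TRUE)

# Build the kernel matrix on the standardized features.
rbf <- rbfdot(sigma = 0.5)
K1 <- kernelMatrix(rbf, x.std)   # 20 x 20 kernel matrix
```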
Value

mkkc returns an object of class "MultipleKernelKmeans" which has a print and a coef method. It is a list with at least the following components:
- A vector of integers (from 1:k) indicating the cluster to which each point is allocated.
- The total sum of squares.
- Matrix of within-cluster sums of squares by cluster, one row per view.
- Vector of within-cluster sums of squares, one component per cluster.
- Vector of within-cluster sums of squares, one component per view.
- Total within-cluster sum of squares, i.e. sum(withinsscluster).
- Vector of between-cluster sums of squares, one component per view.
- The between-cluster sum of squares, i.e. totss - tot.withinss.
- The number of clusters, say k.
- The kernel coefficients.
- The continuous clustering assignment.
- The number of points, one component per cluster.
- The number of iterations.
See Also

kernelMatrix, StandardizeKernel
Examples

require(kernlab)
# define kernel
rbf <- rbfdot(sigma = 0.5)
# construct kernel matrices
n.noise <- 3
dat1 <- kernelMatrix(rbf, simCnoise$view1[,1:(2 + n.noise)])
dat2 <- kernelMatrix(rbf, simCnoise$view2)
dat3 <- kernelMatrix(rbf, simCnoise$view3)
# construct multiview data
K <- array(NA, dim = c(nrow(dat1), ncol(dat1), 3))
K[,,1] <- StandardizeKernel(dat1, center = TRUE, scale = TRUE)
K[,,2] <- StandardizeKernel(dat2, center = TRUE, scale = TRUE)
K[,,3] <- StandardizeKernel(dat3, center = TRUE, scale = TRUE)
# perform multiple kernel k-means
res <- mkkc(K = K, centers = 3)
coef(res) # kernel coefficients of the three views
res$cluster
# perform multiple kernel k-means with constraint (2 * theta3 <= theta1 and theta3 <= theta2)
require(Matrix)
myA <- Matrix(c(1, 0, -2,
0, 1, -1), ncol = 3, byrow = TRUE, sparse = TRUE)
mybc <- rbind(blc = c(0, 0), buc = c(Inf, Inf))
res <- mkkc(K = K, centers = 3, A = myA, bc = mybc)
coef(res) # kernel coefficients of the three views
res$cluster