mkkc: Multiple Kernel K-means Clustering


View source: R/mkkc.R

Description

Performs multiple kernel K-means clustering on multi-view data.

Usage

mkkc(K, centers, iter.max = 10, A = NULL, bc = NULL,
  epsilon = 1e-04, theta = rep(1/dim(K)[3], dim(K)[3]))

## S3 method for class 'MultipleKernelKmeans'
print(x, ...)

Arguments

K

N x N x P array containing P kernel matrices with size N x N.

centers

The number of clusters, say k.

iter.max

The maximum number of iterations allowed. The default is 10.

epsilon

Convergence threshold. The default is 10^{-4}.

theta

Initial values for the kernel coefficients. The default is 1/P for all views.

x

Object of class "MultipleKernelKmeans", as returned by mkkc.

...

Additional arguments passed to print.

Details

The optimization problem is described with an array of the multiple kernels (K), the number of clusters (centers), and the linear constraint on the kernel coefficient (θ).

Our method puts more weight on views with a weak signal for cluster information, so that important, complementary information collected from all the views can be used. That is, the larger the unexplained variance in a view, the larger the kernel coefficient θ assigned to that view. After combining the views, the method minimizes the unexplained variance in the combined view by optimizing continuous cluster assignments. Discrete cluster assignments are recovered by performing K-means clustering on the (normalized) continuous cluster assignments. The K-means algorithm is run with 1000 random starts, and the result that minimizes the objective function is reported.
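The last step described above (K-means on normalized continuous assignments) can be sketched in base R for a single kernel matrix. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the names x, K_toy, H, H_norm and cluster_toy are hypothetical, the continuous assignments are taken to be the top-k eigenvectors of the kernel (the usual spectral relaxation of kernel K-means), and 20 random starts are used instead of 1000 to keep it fast.

```r
# Sketch: recover discrete clusters from continuous assignments (single kernel).
set.seed(1)

# Toy data (hypothetical): two well-separated groups of 20 points in 2-D.
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2))

# RBF kernel exp(-sigma * ||a - b||^2), the same form as kernlab::rbfdot.
sigma <- 0.5
K_toy <- exp(-sigma * as.matrix(dist(x))^2)

# Continuous assignments: top-k eigenvectors of the kernel matrix.
k <- 2
H <- eigen(K_toy, symmetric = TRUE)$vectors[, 1:k, drop = FALSE]

# Normalize each row to unit length before clustering.
H_norm <- H / sqrt(rowSums(H^2))

# Discrete assignments: K-means with several random starts, best fit kept.
fit_toy <- kmeans(H_norm, centers = k, nstart = 20)
cluster_toy <- fit_toy$cluster
table(cluster_toy)
```

kmeans keeps the start with the smallest total within-cluster sum of squares, which mirrors the "best result" rule in the text.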

We recommend standardizing all the original features to have zero mean and unit variance before applying kernel functions. The kernel matrices must be constructed before calling mkkc. We also recommend normalizing the kernels with StandardizeKernel, which makes the views comparable to each other.
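The preprocessing recommended above might look like the following base R sketch. The raw data and all object names are hypothetical. The last step (centering the kernel in feature space and rescaling it to an average diagonal of 1) is an assumption about what a kernel normalization can look like; it is not taken from the StandardizeKernel source, which may scale differently.

```r
# Sketch: standardize features, build a kernel, then normalize the kernel.
set.seed(2)

# Hypothetical raw view: 30 samples, 3 features on very different scales.
raw <- cbind(rnorm(30, mean = 10, sd = 5),
             rnorm(30, mean = 0, sd = 0.1),
             runif(30, min = 100, max = 200))

# Step 1: zero mean and unit variance for every feature.
z <- scale(raw, center = TRUE, scale = TRUE)

# Step 2: RBF kernel exp(-sigma * ||a - b||^2) with sigma = 0.5.
K_raw <- exp(-0.5 * as.matrix(dist(z))^2)

# Step 3 (assumed normalization): center in feature space, then rescale
# so that the average diagonal entry equals 1.
n <- nrow(K_raw)
J <- diag(n) - matrix(1 / n, n, n)
K_centered <- J %*% K_raw %*% J
K_std <- K_centered / (sum(diag(K_centered)) / n)
```

After step 3 every view has the same trace, so no view dominates the combined kernel merely because of its scale.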

Value

mkkc returns an object of class "MultipleKernelKmeans" which has a print and a coef method. It is a list with at least the following components:

cluster

A vector of integers (from 1:k) indicating the cluster to which each point is allocated.

totss

The total sum of squares.

withinss

Matrix of within-cluster sum of squares by cluster, one row per view.

withinsscluster

Vector of within-cluster sum of squares, one component per cluster.

withinssview

Vector of within-cluster sum of squares, one component per view.

tot.withinss

Total within-cluster sum of squares, i.e. sum(withinsscluster).

betweenssview

Vector of between-cluster sum of squares, one component per view.

tot.betweenss

The between-cluster sum of squares, i.e. totss-tot.withinss.

clustercount

The number of clusters, say k.

coefficients

The kernel coefficients.

H

The continuous clustering assignment.

size

The number of points, one component per cluster.

iter

The number of iterations.
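The relations between the sum-of-squares components listed above (for example tot.betweenss = totss - tot.withinss) can be checked on a single kernel matrix. The sketch below is an illustration only: it uses a linear kernel so the kernel-space quantities can be compared with stats::kmeans, all object names are hypothetical, and it makes no claim about how mkkc weights the views when it combines them.

```r
# Sketch: sums of squares computed from a kernel matrix (single view).
set.seed(3)
x_ss <- matrix(rnorm(60), ncol = 2)            # 30 hypothetical points
fit_ss <- kmeans(x_ss, centers = 3, nstart = 10)
cl <- fit_ss$cluster

# Linear kernel: K[i, j] = <x_i, x_j>.
K_lin <- tcrossprod(x_ss)
n <- nrow(K_lin)

# Total sum of squares around the overall mean in feature space.
totss_k <- sum(diag(K_lin)) - sum(K_lin) / n

# Within-cluster sum of squares, one component per cluster.
withinss_k <- sapply(1:3, function(g) {
  idx <- which(cl == g)
  sum(diag(K_lin)[idx]) - sum(K_lin[idx, idx]) / length(idx)
})

tot_withinss_k <- sum(withinss_k)
tot_betweenss_k <- totss_k - tot_withinss_k

# With a linear kernel these agree with the values kmeans reports.
all.equal(totss_k, fit_ss$totss)
all.equal(as.numeric(withinss_k), as.numeric(fit_ss$withinss))
```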

References

\insertRef{bang2018mkkc}{MKKC}

See Also

kernelMatrix, StandardizeKernel

Examples

library(kernlab)
library(MKKC) # provides mkkc, StandardizeKernel and the simCnoise data

# define kernel
rbf <- rbfdot(sigma = 0.5)

# construct kernel matrices
n.noise <- 3
dat1 <- kernelMatrix(rbf, simCnoise$view1[,1:(2 + n.noise)])
dat2 <- kernelMatrix(rbf, simCnoise$view2)
dat3 <- kernelMatrix(rbf, simCnoise$view3)

# construct multiview data
K <- array(NA, dim = c(nrow(dat1), ncol(dat1), 3))
K[,,1] <- StandardizeKernel(dat1, center = TRUE, scale = TRUE)
K[,,2] <- StandardizeKernel(dat2, center = TRUE, scale = TRUE)
K[,,3] <- StandardizeKernel(dat3, center = TRUE, scale = TRUE)

# perform multiple kernel k-means
res <- mkkc(K = K, centers = 3)

coef(res) # kernel coefficients of the three views
res$cluster

# perform multiple kernel k-means with constraints (2 * theta3 <= theta1 and theta3 <= theta2)
library(Matrix)
myA <- Matrix(c(1, 0, -2,
                0, 1, -1), ncol = 3, byrow = TRUE, sparse = TRUE)
mybc <- rbind(blc = c(0, 0), buc =  c(Inf, Inf))
res <- mkkc(K = K, centers = 3, A = myA, bc = mybc)

coef(res) # kernel coefficients of the three views
res$cluster
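
The constrained call above can be read as requiring blc <= A %*% theta <= buc, row by row. That reading is inferred from the comment in the example (2 * theta3 <= theta1 and theta3 <= theta2), not from the package source, so treat it as an assumption. The sketch below checks it with a dense base R matrix; satisfies_constraints and both theta vectors are hypothetical.

```r
# Sketch: check whether kernel coefficients satisfy blc <= A %*% theta <= buc.
A_dense <- matrix(c(1, 0, -2,
                    0, 1, -1), ncol = 3, byrow = TRUE)
bc_toy <- rbind(blc = c(0, 0), buc = c(Inf, Inf))

# Hypothetical helper, not part of the MKKC package.
satisfies_constraints <- function(theta, A, bc) {
  v <- as.vector(A %*% theta)
  all(v >= bc["blc", ] & v <= bc["buc", ])
}

theta_ok  <- c(0.5, 0.3, 0.2)  # 2 * 0.2 <= 0.5 and 0.2 <= 0.3
theta_bad <- c(0.2, 0.3, 0.5)  # 2 * 0.5 >  0.2

satisfies_constraints(theta_ok, A_dense, bc_toy)   # TRUE
satisfies_constraints(theta_bad, A_dense, bc_toy)  # FALSE
```

Under this reading, as.vector(myA %*% coef(res)) from the constrained fit should be non-negative in both components.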

SeojinBang/MKKC documentation built on Sept. 18, 2019, 1:42 p.m.