cvxclust: Convex Clustering Path via Variable Splitting Methods
In cvxclustr: Splitting methods for convex clustering


cvxclust estimates the convex clustering path via variable splitting methods: ADMM and AMA. This function is a wrapper that calls either cvxclust_path_admm or cvxclust_path_ama (the default) to perform the computation. Required inputs include a data matrix X (rows are features; columns are samples), a vector of weights w, and a sequence of regularization parameters gamma. Two penalty norms are currently supported: 1-norm and 2-norm. Both ADMM and AMA admit acceleration schemes at little additional computational cost. Acceleration is turned on by default.
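For orientation, the optimization problem behind the clustering path can be sketched as follows. This is the formulation commonly given in the convex clustering literature, not text from this help page; the notation is assumed here: x_i is the ith column of X, u_i is its centroid, and l = (l_1, l_2) indexes a pair of samples with weight w_l. The norm in the penalty is the 1-norm or 2-norm chosen by `type`.

```latex
% Convex clustering objective (one centroid u_i per sample x_i)
\min_{U}\; \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \gamma \sum_{l} w_l \,\lVert u_{l_1} - u_{l_2} \rVert

% Variable-splitting form: introduce a difference variable v_l for each pair
\min_{U,V}\; \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \gamma \sum_{l} w_l \,\lVert v_l \rVert
\quad \text{subject to} \quad u_{l_1} - u_{l_2} - v_l = 0
```

Under this reading, the returned U, V, and Lambda would correspond to the centroids, the centroid differences v_l, and the Lagrange multipliers of the equality constraints, respectively.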

cvxclust(X, w, gamma, method = "ama", nu = 1, tol = 0.001, max_iter = 10000, type = 2, accelerate = TRUE)

`X`	The data matrix to be clustered. The rows are the features, and the columns are the samples.
`w`	A vector of nonnegative weights. The ith entry `w[i]` denotes the weight used between the ith pair of centroids. The weights are in dictionary order.
`method`	Algorithm to use: "admm" or "ama" (the default).
`gamma`	A sequence of regularization parameters.
`nu`	A positive penalty parameter for the quadratic deviation term.
`tol`	The convergence tolerance.
`max_iter`	The maximum number of iterations.
`type`	An integer indicating the norm used: 1 = 1-norm, 2 = 2-norm.
`accelerate`	If `TRUE` (the default), acceleration is turned on.
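The layout of `X` and the ordering of `w` can be illustrated with a small base-R sketch. It assumes that "dictionary order" means the pairs (i, j) with i < j listed lexicographically, and it uses a Gaussian kernel purely as an illustrative weight; neither assumption is confirmed by this page. The names `X_toy`, `pair_ix`, `phi_toy`, and `w_toy` are hypothetical.

```r
# Toy data: 2 features (rows) by 4 samples (columns), matching the layout of X.
X_toy <- matrix(c(0, 0,
                  1, 0,
                  0, 2,
                  3, 3), nrow = 2)
n_toy <- ncol(X_toy)

# All sample pairs (i, j) with i < j. combn() lists them lexicographically:
# (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
pair_ix <- t(combn(n_toy, 2))

# Illustrative (assumed) Gaussian-kernel weight for each pair, in the same order.
phi_toy <- 0.5
w_toy <- apply(pair_ix, 1, function(p) {
  exp(-phi_toy * sum((X_toy[, p[1]] - X_toy[, p[2]])^2))
})
```

Here `w_toy` has n(n-1)/2 = 6 nonnegative entries, and `w_toy[k]` belongs to the pair in row k of `pair_ix`.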

`U`	A list of centroid matrices.

`V`	A list of centroid difference matrices.

`Lambda`	A list of Lagrange multiplier matrices.

Eric C. Chi, Kenneth Lange

cvxclust_path_ama and cvxclust_path_admm for estimating the clustering path with AMA or ADMM. kernel_weights and knn_weights compute useful weights. To extract cluster assignments from the clustering path, use create_adjacency and find_clusters.

## Clusterpaths for Mammal Dentition
data(mammals)
X <- as.matrix(mammals[,-1])
X <- t(scale(X,center=TRUE,scale=FALSE))
n <- ncol(X)

## Pick some weights and a sequence of regularization parameters.
k <- 5
phi <- 0.5
w <- kernel_weights(X,phi)
w <- knn_weights(w,k,n)
gamma <- seq(0.0,43, length.out=100)

## Perform clustering
sol <- cvxclust(X,w,gamma)

## Plot the cluster path
library(ggplot2)
svdX <- svd(X)
pc <- svdX$u[,1:2,drop=FALSE]
pc.df <- as.data.frame(t(pc)%*%X)
nGamma <- sol$nGamma
df.paths <- data.frame(x=c(),y=c(), group=c())
for (j in seq_len(nGamma)) {
  ## Project the centroids at the jth gamma onto the first two principal components.
  pcs <- t(pc)%*%sol$U[[j]]
  df <- data.frame(x=pcs[1,], y=pcs[2,], group=1:n)
  df.paths <- rbind(df.paths,df)
}
X_data <- as.data.frame(t(X)%*%pc)
colnames(X_data) <- c("x","y")
X_data$Name <- mammals[,1]
data_plot <- ggplot(data=df.paths,aes(x=x,y=y))
data_plot <- data_plot + geom_path(aes(group=group),colour='grey30',alpha=0.5)
data_plot <- data_plot + geom_text(data=X_data,aes(x=x,y=y,label=Name),
  position=position_jitter(h=0.125,w=0.125))
data_plot <- data_plot + geom_point(data=X_data,aes(x=x,y=y),size=1.5)
data_plot <- data_plot + xlab('Principal Component 1') + ylab('Principal Component 2')
data_plot + theme_bw()

## Output Cluster Assignment at 10th gamma
A <- create_adjacency(sol$V[[10]],w,n)
find_clusters(A)

## Visualize Cluster Assignment
G <- graph.adjacency(A, mode = 'upper')
plot(G,vertex.label=as.character(mammals[,1]),vertex.label.cex=0.65,vertex.label.font=2)

Loading required package: Matrix
Loading required package: igraph

Attaching package: 'igraph'

The following objects are masked from 'package:stats':

    decompose, spectrum

The following object is masked from 'package:base':

    union

Warning message:
In cvxclust_ama(X, Lambda, ix - 1, M1 - 1, M2 - 1, s1, s2, w[w >  :
  The stepsize nu may be too large. Setting it to maximum of 1/n and 1/AnM.
$cluster
 [1]  1  2  3  1  1  1  1  1  4  5  5  6  6  4  7  2  2  2  2  2  2  2  2  8  9
[26] 10 10

$size
 [1] 6 9 1 2 2 2 1 1 1 2
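
The printed result can be read as follows: `$cluster` gives one label per sample, and `$size` counts the samples carrying each label. The base-R sketch below copies the labels from the output above and recovers the sizes and the members of each cluster; the names `cluster`, `sizes`, and `members` are chosen here for illustration.

```r
# Cluster labels copied from the printed $cluster output above.
cluster <- c(1, 2, 3, 1, 1, 1, 1, 1, 4, 5, 5, 6, 6, 4, 7,
             2, 2, 2, 2, 2, 2, 2, 2, 8, 9, 10, 10)

# Tabulating the labels reproduces the $size vector.
sizes <- as.vector(table(cluster))

# Group sample indices by label; members[["2"]] holds the samples in cluster 2.
members <- split(seq_along(cluster), cluster)
```

In a live session, the labels would presumably come from `find_clusters(A)$cluster` rather than a typed vector, and `mammals[members[["2"]], 1]` would then list the names in the second cluster.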