cvxclust_path_ama estimates the convex clustering path via the Alternating Minimization Algorithm (AMA).
Required inputs include a data matrix X (rows are features; columns are samples), a vector of weights w, and a sequence of regularization parameters gamma.
Two penalty norms are currently supported: 1-norm and 2-norm.
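For orientation, the problem being solved at each value of gamma is usually written as below. The notation is an assumption based on the standard convex clustering formulation and is not taken from this page: x_i is the ith column of X, u_i is its centroid, l = (i, j) indexes a pair of samples, and the unsubscripted norm is the 1-norm or the 2-norm depending on type.

```latex
\min_{U} \; \frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
\;+\; \gamma \sum_{l = (i,j),\; i < j} w_l \, \lVert u_i - u_j \rVert
```

At gamma = 0 each centroid equals its own sample; as gamma grows, the penalty pulls centroids together until they fuse, which is what traces out the clustering path.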
AMA performs proximal gradient ascent on the dual function and can therefore be accelerated with FISTA.
This speed-up is employed by default.
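To illustrate what FISTA acceleration adds to a proximal gradient method, here is a minimal sketch on a toy lasso problem. It is not the dual problem solved by cvxclust_path_ama, and the names soft_threshold and fista_lasso are hypothetical helpers written for this illustration only.

```r
# Soft-thresholding: the proximal operator of t * ||.||_1.
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Toy FISTA for  min_x 0.5 * ||A x - b||^2 + lambda * ||x||_1.
# Illustrative only; this is NOT the dual problem from the package.
fista_lasso <- function(A, b, lambda, max_iter = 1000, tol = 1e-10) {
  # Lipschitz constant of the gradient: largest eigenvalue of A'A.
  L <- max(eigen(crossprod(A), symmetric = TRUE, only.values = TRUE)$values)
  x <- numeric(ncol(A))   # current iterate
  y <- x                  # extrapolated point
  t <- 1                  # momentum sequence
  for (k in seq_len(max_iter)) {
    grad  <- drop(crossprod(A, A %*% y - b))
    x_new <- soft_threshold(y - grad / L, lambda / L)  # proximal gradient step at y
    t_new <- (1 + sqrt(1 + 4 * t^2)) / 2
    y <- x_new + ((t - 1) / t_new) * (x_new - x)       # extrapolation (momentum) step
    converged <- sqrt(sum((x_new - x)^2)) < tol
    x <- x_new
    t <- t_new
    if (converged) break
  }
  x
}

A <- diag(c(1, 2))
b <- c(0.5, 3)
x_hat <- fista_lasso(A, b, lambda = 1)
```

The only difference from plain proximal gradient is that the gradient step is taken at the extrapolated point y rather than at the previous iterate x. Setting accelerate = FALSE in cvxclust_path_ama turns the corresponding speed-up off.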
cvxclust_path_ama(X, w, gamma, nu = 1, tol = 0.001, max_iter = 10000,
                  type = 2, accelerate = TRUE)
X |
The data matrix to be clustered. The rows are the features, and the columns are the samples. |
w |
A vector of nonnegative weights. The ith entry |
gamma |
A sequence of regularization parameters. |
nu |
The initial step size parameter when backtracking is applied. Otherwise it is a fixed step size in which case there are no guarantees of convergence if it exceeds |
tol |
The convergence tolerance. |
max_iter |
The maximum number of iterations. |
type |
An integer indicating the norm used: 1 = 1-norm, 2 = 2-norm. |
accelerate |
If |
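To make the roles of w, gamma, and type concrete, the following sketch evaluates the convex clustering objective for a candidate centroid matrix U using base R only. The function cvx_objective is a hypothetical helper, not part of the package, and it assumes that w holds one weight per pair of samples in the pair order produced by combn(n, 2).

```r
# Hypothetical helper (not part of the package): evaluate the convex
# clustering objective at candidate centroids U.
# Assumption: w has one entry per sample pair, ordered as in combn(n, 2).
cvx_objective <- function(X, U, w, gamma, type = 2) {
  n <- ncol(X)
  pairs <- combn(n, 2)                      # 2 x choose(n, 2) matrix of index pairs
  stopifnot(length(w) == ncol(pairs))
  # Centroid differences u_i - u_j, one column per pair.
  D <- U[, pairs[1, ], drop = FALSE] - U[, pairs[2, ], drop = FALSE]
  # Penalty norm per pair: 1-norm if type == 1, otherwise 2-norm.
  pen <- if (type == 1) colSums(abs(D)) else sqrt(colSums(D^2))
  0.5 * sum((X - U)^2) + gamma * sum(w * pen)
}

X <- matrix(c(0, 0, 3, 4), nrow = 2)        # two samples: (0, 0) and (3, 4)
w <- 1
obj_l2    <- cvx_objective(X, X, w, gamma = 1, type = 2)
obj_l1    <- cvx_objective(X, X, w, gamma = 1, type = 1)
U_fused   <- matrix(rowMeans(X), nrow = 2, ncol = 2)
obj_fused <- cvx_objective(X, U_fused, w, gamma = 1)
```

With U equal to X the fit term vanishes and only the penalty remains; with both centroids fused at the mean the penalty vanishes and only the fit term remains.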
U
A list of centroid matrices.
V
A list of centroid difference matrices.
Lambda
A list of Lagrange multiplier matrices.
Eric C. Chi, Kenneth Lange
cvxclust_path_admm for estimating the clustering path with ADMM.
kernel_weights and knn_weights compute useful weights.
To extract cluster assignments from the clustering path, use create_adjacency and find_clusters.
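The idea behind extracting cluster assignments is that two samples belong to the same cluster when their centroid difference is zero. The sketch below shows that idea in base R; it is not the implementation of create_adjacency or find_clusters. The helper clusters_from_differences is hypothetical, and it assumes V has one column per sample pair in the order produced by combn(n, 2).

```r
# Hypothetical sketch: group samples whose centroid difference is
# (numerically) zero. Assumes one column of V per pair, in combn(n, 2) order.
clusters_from_differences <- function(V, n, tol = 1e-8) {
  pairs <- combn(n, 2)
  fused <- sqrt(colSums(V^2)) <= tol        # TRUE where the pair's centroids coincide
  label <- seq_len(n)
  # Propagate the smallest label across fused pairs until nothing changes,
  # which yields the connected components of the "fused" graph.
  repeat {
    changed <- FALSE
    for (l in which(fused)) {
      i <- pairs[1, l]
      j <- pairs[2, l]
      m <- min(label[i], label[j])
      if (label[i] != m || label[j] != m) {
        label[c(i, j)] <- m
        changed <- TRUE
      }
    }
    if (!changed) break
  }
  match(label, unique(label))               # relabel clusters as 1, 2, ...
}

# Four samples whose centroids have fused into two groups.
U <- cbind(c(0, 0), c(0, 0), c(5, 5), c(5, 5))
pairs <- combn(4, 2)
V <- U[, pairs[1, ]] - U[, pairs[2, ]]
cl <- clusters_from_differences(V, n = 4)
```

In practice the package functions should be preferred, since they take the weight vector w into account when building the adjacency matrix.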
## Clusterpaths for Mammal Dentition
library(cvxclustr)
data(mammals)
X <- as.matrix(mammals[,-1])
X <- t(scale(X,center=TRUE,scale=FALSE))
n <- ncol(X)
## Pick some weights and a sequence of regularization parameters.
k <- 5
phi <- 0.5
w <- kernel_weights(X,phi)
w <- knn_weights(w,k,n)
gamma <- seq(0.0,43, length.out=100)
## Perform clustering
nu <- AMA_step_size(w,n)
sol <- cvxclust_path_ama(X,w,gamma,nu=nu)
## Plot the cluster path
library(ggplot2)
svdX <- svd(X)
pc <- svdX$u[,1:2,drop=FALSE]
pc.df <- as.data.frame(t(pc)%*%X)
nGamma <- sol$nGamma
df.paths <- data.frame(x=c(),y=c(), group=c())
for (j in 1:nGamma) {
  pcs <- t(pc)%*%sol$U[[j]]
  df <- data.frame(x=pcs[1,], y=pcs[2,], group=1:n)
  df.paths <- rbind(df.paths,df)
}
X_data <- as.data.frame(t(X)%*%pc)
colnames(X_data) <- c("x","y")
X_data$Name <- mammals[,1]
data_plot <- ggplot(data=df.paths,aes(x=x,y=y))
data_plot <- data_plot + geom_path(aes(group=group),colour='grey30',alpha=0.5)
data_plot <- data_plot + geom_text(data=X_data,aes(x=x,y=y,label=Name),
                                   position=position_jitter(h=0.125,w=0.125))
data_plot <- data_plot + geom_point(data=X_data,aes(x=x,y=y),size=1.5)
data_plot <- data_plot + xlab('Principal Component 1') + ylab('Principal Component 2')
data_plot + theme_bw()
## Output Cluster Assignment at 10th gamma
A <- create_adjacency(sol$V[[10]],w,n)
find_clusters(A)
## Visualize Cluster Assignment
library(igraph)
G <- graph.adjacency(A, mode = 'upper')
plot(G,vertex.label=as.character(mammals[,1]),vertex.label.cex=0.65,vertex.label.font=2)