
This function clusters the columns of `Y` into `K` groups, using `X` as auxiliary information.


| Argument | Description |
| --- | --- |
| `Y` | An n x p matrix of p variables and n observations. The columns of `Y` are clustered into `K` groups. |
| `X` | An n x q matrix of q variables and n observations. |
| `K` | The number of clusters. |
| `B` | The number of iterations in the simulated annealing algorithm. |
| `L` | The temperature coefficient in the simulated annealing algorithm. |
| `alpha` | The coefficient of the elastic net penalty. |
| `nlambdas` | The number of tuning parameters in the elastic net. |
| `sampling` | If 'equal', the sampling probabilities are the same during the simulated annealing algorithm; if 'size', the probabilities are proportional to the sizes of the clusters in the current iteration. |
| `ncv` | The number of cross-validations in the elastic net. |
| `dist` | The type of distance metric used to construct the similarity matrix. Options are 'gaussian', 'euclidean' and 'correlation', the latter being the default. |
| `sigma` | The parameter of the Gaussian kernel distance. It is ignored unless 'gaussian' is chosen as the distance measure. |
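To make the `dist` and `sigma` arguments more concrete, here is a minimal sketch of how a similarity matrix between the columns of `Y` might be built for each option. The helper `similarity_sketch` is hypothetical and not part of the package, and the three formulas are assumptions chosen for illustration; the package may define them differently.

```r
# Toy data: n = 20 observations, p = 6 variables (the columns are clustered).
set.seed(1)
Y <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)

# Hypothetical helper, not part of the package: returns a p x p similarity
# matrix between the columns of Y under three assumed definitions.
similarity_sketch <- function(Y,
                              dist = c("correlation", "euclidean", "gaussian"),
                              sigma = 1) {
  dist <- match.arg(dist)
  # Pairwise Euclidean distances between columns (stats::dist works on rows).
  D <- as.matrix(stats::dist(t(Y)))
  switch(dist,
    correlation = abs(stats::cor(Y)),        # assumed: absolute Pearson correlation
    euclidean   = 1 / (1 + D),               # assumed: decreasing transform of distance
    gaussian    = exp(-D^2 / (2 * sigma^2))  # assumed: Gaussian (RBF) kernel
  )
}

W_cor <- similarity_sketch(Y, "correlation")
W_gau <- similarity_sketch(Y, "gaussian", sigma = 2)
```

In this sketch `sigma` only enters the Gaussian branch, which mirrors the statement that it is ignored for the other distance measures.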

The algorithm minimizes a modified version of NCut through simulated annealing. The modified NCut uses in the numerator the similarity matrix of the original data `Y`, and the denominator uses the similarity matrix of the prediction of `Y` using `X`. The clusters correspond to partitions that minimize this objective function. The external information of `X` is incorporated by using elastic net to predict `Y`.
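The procedure described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: ridge regression in closed form stands in for the elastic net (it corresponds to `alpha = 0` with a fixed penalty), the exact form of the modified NCut is one plausible reading of the description, and the proposal and acceptance rules are assumed. The names `modified_ncut`, `n_iter`, `temp_coef` and `lambda` are hypothetical.

```r
set.seed(1)
n <- 40; p <- 10; q <- 10; K <- 2

# Toy data with two blocks: the first half of Y depends on the first half of X.
X <- matrix(rnorm(n * q), n, q)
Bmat <- matrix(0, q, p)
Bmat[1:5, 1:5] <- 0.5
Bmat[6:10, 6:10] <- 0.5
Y <- X %*% Bmat + matrix(rnorm(n * p), n, p)

# Stand-in for the elastic net step: closed-form ridge regression
# with a fixed, hypothetical penalty.
lambda <- 1
Yhat <- X %*% solve(crossprod(X) + lambda * diag(q), crossprod(X, Y))

# Similarity between columns (absolute correlation, assumed).
Wy    <- abs(cor(Y))
Wyhat <- abs(cor(Yhat))

# Assumed form of the modified NCut: for each cluster, between-cluster
# similarity in Wy divided by within-cluster similarity in Wyhat.
modified_ncut <- function(Z, Wy, Wyhat) {
  total <- 0
  for (k in seq_len(ncol(Z))) {
    z <- Z[, k]
    cut_k <- sum(Wy[z == 1, z == 0])
    vol_k <- sum(Wyhat[z == 1, z == 1])
    total <- total + cut_k / vol_k
  }
  total
}

# Random initial partition as a p x K indicator matrix.
labels <- sample(rep(seq_len(K), length.out = p))
Z <- diag(K)[labels, , drop = FALSE]

n_iter    <- 200   # plays the role of B
temp_coef <- 1000  # plays the role of L
loss <- numeric(n_iter)
current <- modified_ncut(Z, Wy, Wyhat)

for (b in seq_len(n_iter)) {
  # Propose moving one randomly chosen column of Y to another cluster.
  j <- sample.int(p, 1)
  old <- which(Z[j, ] == 1)
  if (sum(Z[, old]) > 1) {  # never empty a cluster
    others <- setdiff(seq_len(K), old)
    new <- others[sample.int(length(others), 1)]
    Zp <- Z
    Zp[j, old] <- 0
    Zp[j, new] <- 1
    proposal <- modified_ncut(Zp, Wy, Wyhat)
    # Assumed acceptance rule: always accept improvements; accept worse
    # moves with a probability that shrinks as the iterations advance.
    accept_prob <- exp(-temp_coef * log(b + 1) * (proposal - current))
    if (proposal <= current || runif(1) < accept_prob) {
      Z <- Zp
      current <- proposal
    }
  }
  loss[b] <- current
}
```

After the loop, `Z` holds the partition as an indicator matrix and `loss` holds the annealing path, analogous to the `cluster` and `loss` components returned by the function.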

A list with the final value of the objective function, the clusters and the lambda penalty chosen through cross-validation.

A list with the following components:

- loss: a vector of length `N` which contains the loss at each iteration of the simulated annealing algorithm.
- cluster: a matrix representing the clustering result, of dimension `p` times `K`, where `p` is the number of columns of `Y`.
- lambda.min: the optimal lambda chosen through cross-validation for the elastic net for predicting `Y` with `X`.
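As a small illustration of how the returned list might be consumed, the sketch below builds a mock object with the documented component names and converts the `cluster` matrix into one label per column of `Y`. It assumes that `cluster` is a 0/1 indicator matrix with exactly one 1 per row; the numbers are made up and are not output from the function.

```r
# Mock return value with the documented components; values are invented.
res <- list(
  loss = c(1.9, 1.4, 1.1, 1.1),
  cluster = rbind(c(1, 0), c(1, 0), c(0, 1), c(1, 0), c(0, 1)),
  lambda.min = 0.05
)

# One label per column of Y: the position of the 1 in each row.
labels <- max.col(res$cluster, ties.method = "first")

# Cluster sizes and the loss at the last annealing iteration.
sizes <- table(labels)
final_loss <- res$loss[length(res$loss)]
```

A label vector like this is often easier to pass to downstream tools than the indicator matrix.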

Sebastian Jose Teran Hidalgo and Shuangge Ma. Maintainer: Sebastian Jose Teran Hidalgo. sebastianteranhidalgo@gmail.com.

Hidalgo, Sebastian J. Teran, Mengyun Wu, and Shuangge Ma. Assisted clustering of gene expression data using ANCut. BMC Genomics 18.1 (2017): 623.

```
# Set up the parameters for the simulation.
library(MASS)   # for mvrnorm()
library(fields) # for image.plot()
n <- 30     # Sample size.
B <- 50     # Number of iterations in the simulated annealing algorithm.
L <- 10000  # Temperature coefficient.
p <- 50     # Number of columns of Y.
q <- p      # Number of columns of X.
h1 <- 0.15  # Lower bound of the nonzero regression coefficients.
h2 <- 0.25  # Upper bound of the nonzero regression coefficients.

# Covariance of X: correlation 0.2 within each half, 0 across halves.
S <- matrix(0.2, q, q)
S[1:(q/2), (q/2+1):q] <- 0
S[(q/2+1):q, 1:(q/2)] <- 0
S <- S - diag(diag(S)) + diag(q)
mu <- rep(0, q)

# W0 marks pairs of columns of Y that belong to different true clusters.
W0 <- matrix(1, p, p)
W0[1:(p/2), 1:(p/2)] <- 0
W0[(p/2+1):p, (p/2+1):p] <- 0
Denum <- sum(W0)

# Coefficients: each column of Y depends on 6 columns of X from its own half.
B2 <- matrix(0, q, p)
for (i in 1:(p/2)) {
  B2[1:(q/2), i] <- runif(q/2, h1, h2)
  in1 <- sample.int(q/2, 6)
  B2[-in1, i] <- 0
}
for (i in (p/2+1):p) {
  B2[(q/2+1):q, i] <- runif(q/2, h1, h2)
  in2 <- sample(seq(q/2+1, q), 6)
  B2[-in2, i] <- 0
}

# Simulate X, the signal Z and the noisy response Y.
X <- mvrnorm(n, mu, S)
Z <- X %*% B2
Y <- Z + matrix(rnorm(n * p, 0, 1), n, p)

# Our method
Res <- ancut(Y = Y, X = X, B = B, L = L, alpha = 0, ncv = 3)
Cx <- Res[[2]]  # Second component: the p x K cluster matrix.
f11 <- matrix(Cx[, 1], p, 1)
f12 <- matrix(Cx[, 2], p, 1)
errorL <- sum((f11 %*% t(f11)) * W0) / Denum + sum((f12 %*% t(f12)) * W0) / Denum
# This is the true error of the clustering solution.
errorL

par(mfrow = c(1, 2))
# Below is a plot of the simulated annealing path.
plot(Res[[1]], type = 'l', ylab = '')
# Cluster found by ANCut
image.plot(Cx)
```
