SNF: Similarity network fusion
In IntClust: Integration of Multiple Data Sets with Clustering Techniques


Similarity Network Fusion (SNF) is a similarity-based multi-source clustering technique that proceeds in two steps. In the first step, a similarity network is built for each data matrix. The network represents the similarity matrix as a weighted graph, with the objects as vertices and the pairwise similarities as edge weights. In the network-fusion step, each network is iteratively updated with information from the other networks, which makes the networks more alike with every iteration. The networks eventually converge to a single fused network.
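The two steps can be sketched in base R for two data sources. This is a minimal illustration under stated assumptions, not the IntClust implementation: the function names (`affinity`, `full_kernel`, `local_kernel`, `snf_two`) are hypothetical, the kernel and update rule follow the general SNF scheme of scaled exponential similarities and nearest-neighbour diffusion, and neighbours are taken to exclude the object itself.

```r
# Illustrative SNF sketch for two views (hypothetical helpers, not IntClust code).

# Step 1: similarity network from a distance matrix D.
# eps averages the mean distance of i and j to their K nearest neighbours
# and the distance between i and j; mu scales the kernel width.
affinity <- function(D, K, mu) {
  knn_mean <- apply(D, 1, function(d) mean(sort(d)[2:(K + 1)]))
  eps <- (outer(knn_mean, knn_mean, "+") + D) / 3
  W <- exp(-D^2 / (mu * eps))
  (W + t(W)) / 2
}

# Row-normalised full kernel: 1/2 on the diagonal, off-diagonal rows sum to 1/2.
full_kernel <- function(W) {
  diag(W) <- 0
  P <- W / (2 * rowSums(W))
  diag(P) <- 0.5
  P
}

# Sparse local kernel: keep the K largest similarities per row, rows sum to 1.
local_kernel <- function(W, K) {
  diag(W) <- 0
  t(apply(W, 1, function(w) {
    keep <- order(w, decreasing = TRUE)[seq_len(K)]
    out <- numeric(length(w))
    out[keep] <- w[keep]
    out / sum(out)
  }))
}

# Step 2: each network is updated through the other one, then re-normalised.
snf_two <- function(D1, D2, K = 3, mu = 0.5, iters = 20) {
  W1 <- affinity(D1, K, mu); W2 <- affinity(D2, K, mu)
  P1 <- full_kernel(W1);     P2 <- full_kernel(W2)
  S1 <- local_kernel(W1, K); S2 <- local_kernel(W2, K)
  for (it in seq_len(iters)) {
    P1_new <- S1 %*% P2 %*% t(S1)
    P2_new <- S2 %*% P1 %*% t(S2)
    P1 <- full_kernel(P1_new)
    P2 <- full_kernel(P2_new)
  }
  fused <- (P1 + P2) / 2
  (fused + t(fused)) / 2   # symmetrise the final network
}

# Toy data: 8 objects measured in two sources (random values, for illustration only).
set.seed(1)
X1 <- matrix(rnorm(8 * 4), nrow = 8)
X2 <- matrix(rnorm(8 * 5), nrow = 8)
fused <- snf_two(as.matrix(dist(X1)), as.matrix(dist(X2)), K = 3)
dim(fused)
```

The iteration count is named `iters` here rather than `T` to avoid shadowing R's built-in shorthand for `TRUE`.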

SNF(List, type = c("data", "dist", "clusters"), distmeasure = c("tanimoto",
  "tanimoto"), normalize = c(FALSE, FALSE), method = c(NULL, NULL),
  StopRange = FALSE, NN = 20, mu = 0.5, T = 20, clust = "agnes",
  linkage = "ward", alpha = 0.625)

`List`	A list of data matrices of the same type. The rows are assumed to correspond to the objects.
`type`	Indicates whether the matrices provided in "List" are data matrices, distance matrices or clustering results obtained from the data. If type="dist", the calculation of the distance matrices is skipped, and if type="clusters", the single-source clustering is skipped. Should be one of "data", "dist" or "clusters".
`distmeasure`	A vector of the distance measures to be used on each data matrix. Should be one of "tanimoto", "euclidean", "jaccard", "hamming". Defaults to c("tanimoto","tanimoto").
`normalize`	Logical. Indicates whether to normalize the distance matrices. Defaults to c(FALSE, FALSE) for two data sets. Normalization is recommended if different distance types are used. See `Normalization` for details.
`method`	A method of normalization. Should be one of "Quantile", "Fisher-Yates", "standardize", "Range" or any of the first letters of these names. Default is c(NULL, NULL) for two data sets.
`StopRange`	Logical. Controls whether distance matrices whose values do not lie between zero and one are rescaled to that range. If FALSE, range normalization is performed (see `Normalization`). If TRUE, the distance matrices are left unchanged. Rescaling is recommended if different types of data are used, so that the distance matrices are comparable. Default is FALSE.
`NN`	The number of neighbours to be used in the procedure. Defaults to 20.
`mu`	The scaling hyperparameter of the similarity kernel. A value between 0.3 and 0.8 is recommended. Defaults to 0.5.
`T`	The number of iterations of the network-fusion step. Defaults to 20.
`clust`	Choice of clustering function (character). Defaults to "agnes".
`linkage`	Choice of inter group dissimilarity (character) for the final clustering. Defaults to "ward".
`alpha`	The parameter alpha to be used in the "flexible" linkage of the agnes function. Defaults to 0.625 and is only used if the linkage is set to "flexible".
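To make the `distmeasure` and range-normalization arguments concrete, the following base-R sketch computes Tanimoto distances between the rows of a binary matrix and rescales a distance matrix to the interval from zero to one. Both helpers (`tanimoto_dist`, `range_normalize`) are hypothetical names written for this illustration. Whether IntClust computes these quantities in exactly this way is an assumption.

```r
# Tanimoto distance between the rows of a binary (0/1) matrix:
# 1 - (features shared by both objects) / (features present in either object).
# Illustrative helper, not an IntClust function.
tanimoto_dist <- function(X) {
  X <- as.matrix(X)
  common <- X %*% t(X)                           # shared features per pair
  counts <- rowSums(X)                           # features per object
  either <- outer(counts, counts, "+") - common  # features in either object
  # Assumption: two all-zero rows are treated as identical (similarity 1).
  sim <- ifelse(either == 0, 1, common / either)
  1 - sim
}

# Min-max rescaling of a distance matrix to [0, 1] (assumed form of range normalization).
range_normalize <- function(D) (D - min(D)) / (max(D) - min(D))

# Three toy objects with four binary features.
X <- rbind(a = c(1, 1, 0, 0),
           b = c(1, 0, 1, 0),
           c = c(1, 1, 0, 0))

D <- tanimoto_dist(X)                       # already lies between 0 and 1
E <- range_normalize(as.matrix(dist(X)))    # Euclidean distances rescaled to [0, 1]
```

Tanimoto distances on binary data are bounded by zero and one by construction, whereas Euclidean distances are not, which is why rescaling matters when the two are combined.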

If r is specified and nrclusters is a fixed number, only a random sampling of the features is performed for the t iterations (ADECa). If r is NULL and nrclusters is a sequence, the clustering is performed on all features and the dendrogram is divided into clusters for each value of nrclusters (ADECb). If r is specified and nrclusters is a sequence, the two are combined (ADECc). After every iteration, whether it involves random sampling, multiple divisions of the dendrogram or both, an incidence matrix is set up. All incidence matrices are summed, and the sum represents the distance matrix on which a final clustering is performed.

The returned value is a list with the following three elements.

`FusedM`	The fused similarity matrix
`DistM`	The distance matrix computed by subtracting FusedM from one
`Clust`	The resulting clustering

The value has class 'SNF'.
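The relation between a fused similarity matrix, the derived distance matrix and the final clustering can be sketched with the `cluster` package. The fused matrix below is a random stand-in, not SNF output, and the object names are hypothetical. How IntClust stores the clustering inside `Clust` is not shown here.

```r
library(cluster)  # provides agnes()

set.seed(42)
# Stand-in for a fused similarity matrix: symmetric, values in [0, 1], unit diagonal.
n <- 10
S <- matrix(runif(n * n), n, n)
fused <- (S + t(S)) / 2
diag(fused) <- 1

# Distance matrix obtained by subtracting the similarities from one.
dist_m <- 1 - fused

# Agglomerative clustering on the distances with Ward linkage.
hc_ward <- agnes(as.dist(dist_m), diss = TRUE, method = "ward")

# The "flexible" linkage takes its alpha through par.method.
hc_flex <- agnes(as.dist(dist_m), diss = TRUE, method = "flexible", par.method = 0.625)

# Cut the Ward dendrogram into three clusters.
groups <- cutree(as.hclust(hc_ward), k = 3)
table(groups)
```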

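Wang, B., Mezlini, A. M., Demir, F., Fiume, M., Tu, Z., Brudno, M., Haibe-Kains, B. and Goldenberg, A. (2014). Similarity network fusion for aggregating data types on a genomic scale. Nature Methods, 11(3), 333-337.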

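library(IntClust)

# Example data shipped with the package
data(fingerprintMat)
data(targetMat)
L <- list(fingerprintMat, targetMat)

# Fuse the two data sources using Tanimoto distances, then cluster with agnes
MCF7_SNF <- SNF(List = L, type = "data", distmeasure = c("tanimoto", "tanimoto"),
  normalize = c(FALSE, FALSE), method = c(NULL, NULL), StopRange = FALSE,
  NN = 10, mu = 0.5, T = 20, clust = "agnes", linkage = "ward", alpha = 0.625)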