The Aggregating Bundles of Clusters algorithm (ABC; Amaratunga et al., 2008) was originally developed for a single gene expression data set. ABC is an iterative algorithm: in each iteration, a random sample of objects and features is drawn from the data set. A clustering algorithm is run on each subset, and an incidence matrix $C$ is set up by cutting the resulting dendrogram into $k$ clusters. After $r$ iterations, all incidence matrices are summed and divided, pair by pair, by the number of times the two objects were selected simultaneously. This similarity value is transformed into a dissimilarity measure expressing how often the objects are not clustered together when both are selected. The resulting matrix is used as input to a final clustering algorithm.
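The steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `abc_sketch` and its arguments are hypothetical, objects are bagged with replacement and features are drawn uniformly without weighting, Euclidean distance and Ward linkage (`hclust` with `"ward.D2"`) are used throughout, and a pair of objects that is never selected together is assigned a similarity of 0.

```r
# Illustrative ABC-style sketch (hypothetical helper, not the package's code).
# Rows of `x` are objects, columns are features.
abc_sketch <- function(x, r = 50, n_feat = 2, k = 2, k_final = k) {
  n <- nrow(x)
  together <- matrix(0, n, n)  # times a pair fell in the same cluster
  selected <- matrix(0, n, n)  # times a pair was drawn in the same iteration
  for (i in seq_len(r)) {
    objs  <- unique(sample.int(n, n, replace = TRUE))  # bagged objects
    feats <- sample.int(ncol(x), n_feat)               # random feature subset
    tree  <- hclust(dist(x[objs, feats, drop = FALSE]), method = "ward.D2")
    cl    <- cutree(tree, k = k)                       # cut dendrogram in k clusters
    inc   <- outer(cl, cl, "==") * 1                   # incidence matrix C
    together[objs, objs] <- together[objs, objs] + inc
    selected[objs, objs] <- selected[objs, objs] + 1
  }
  # Assumption: pairs never drawn together get similarity 0.
  sim  <- ifelse(selected > 0, together / selected, 0)
  diss <- 1 - sim                                      # similarity -> dissimilarity
  diag(diss) <- 0
  final <- hclust(as.dist(diss), method = "ward.D2")   # final clustering
  list(DistM = diss, Clust = cutree(final, k = k_final))
}

# Usage on simulated data with two well-separated groups of ten objects each.
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0), 10, 4),
           matrix(rnorm(40, mean = 10), 10, 4))
res <- abc_sketch(x, r = 30, n_feat = 2, k = 2)
table(res$Clust)
```

The sketch keeps two counters so that each pair is normalised by the number of iterations in which both objects were actually drawn, which is the division step described above.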
data |
A data matrix. The rows are assumed to correspond to the objects. |
transpose |
Logical, whether the data should be transposed to match the original ABC format, in which rows are the variables and columns are the samples. Defaults to TRUE. |
distmeasure |
The distance measure to be used for the data matrix. Should be one of "tanimoto", "euclidean", "jaccard", "hamming". Defaults to "euclidean". |
weighting |
Logical value indicating whether the rows should be weighted in the resampling. |
stat |
The statistic to be used in weighting the rows. Currently the coefficient of variation and the variance are allowed; the corresponding inputs are "cv" and "var". If the rows are to be weighted equally, any other string will do. |
normalize |
Logical. Indicates whether to normalize the distance matrices or not, default is FALSE. This is recommended if different distance types are used. More details on normalization in |
method |
A method of normalization. Should be one of "Quantile", "Fisher-Yates", "standardize", "Range" or any of the first letters of these names. Default is NULL. |
gr |
A prespecified grouping of the samples to be used in calculating the F-statistic if stat="F". |
bag |
Logical, indicating whether the columns should be bagged in each iteration. Defaults to TRUE. |
numsim |
The number of iterations to be used in the ABC Algorithm. Default is 1000. |
numvar |
The number of features to be used at each iteration to calculate the temporary clusters in the ABC Algorithm. |
linkage |
Choice of inter-group dissimilarity (character). Defaults to "ward". |
alpha |
The parameter alpha to be used in the "flexible" linkage of the agnes function. Defaults to 0.625 and is only used if the linkage is set to "flexible". |
NC |
Expected number of clusters in the data; passed to Ward's method in each iteration. Default is NULL. |
NC2 |
Expected number of clusters in the data; passed to Ward's method in the final calculation of the clusters. By default set to NULL, in which case NC2=NC. If NC2="syl", a silhouette will be used to determine the most likely number of clusters. |
mds |
Logical, indicating whether the dissimilarities calculated in the ABC Algorithm should be plotted using multidimensional scaling. Defaults to FALSE. |
The returned value is a list of two elements:
DistM |
The resulting distance matrix |
Clust |
The resulting clustering |
The value has class 'Ensemble'.
Amaratunga2008IntClust
data(fingerprintMat)
data(targetMat)
L <- list(fingerprintMat, targetMat)
MCF7_ABC <- ABC.SingleInMultiple(data = fingerprintMat, transpose = TRUE,
                                 distmeasure = "tanimoto", weighting = TRUE,
                                 stat = "var", gr = c(), bag = TRUE,
                                 numsim = 100, numvar = 100,
                                 linkage = "flexible", alpha = 0.625,
                                 NC = 7, NC2 = NULL, mds = FALSE)