For a given matrix of annotation information, this function returns the information ordered according to the best fit with the data.
An instance of class
The name of the annotations matrix. Default is
The number of clusters to test. Default is k = 1:5.
The minimum number of proteins per component cluster.
The normalisation factor, per
An optional random number generation seed.
As there are typically many protein/annotation sets that may fit the data, we order protein sets by best fit, i.e. cluster tightness, by computing the mean normalised Euclidean distance over all instances in each protein set.
For each protein set, i.e. the proteins that have been labelled with a specified term/information criterion, we find the best k cluster components for the set (the default is to test k = 1:5) according to the minimum mean normalised pairwise Euclidean distance over all component clusters. (Note: when testing k, if any components are found to have fewer than n proteins, these components are not included and k is reduced by 1.)
Each component cluster is normalised by N^p (where N is the total number of proteins per component and p is the power). Heuristically, p = 1/3, i.e. normalising by N^(1/3), has been found to be the optimum normalisation factor.
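The scoring described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: score_protein_set is a hypothetical helper, k-means is assumed as the clustering method (the text does not name one), and "normalised by N^p" is read as dividing each component's mean pairwise distance by N^p.

```r
# Hypothetical helper (not part of the package): score one protein set.
# x    : numeric matrix, one row per protein in the set
# ks   : numbers of clusters to test
# n    : minimum number of proteins per component cluster (must be >= 2 here)
# p    : normalisation power, components are scaled by N^p
# seed : random seed, since k-means starts from random centres
score_protein_set <- function(x, ks = 1:5, n = 5, p = 1/3, seed = 1) {
  stopifnot(is.matrix(x), n >= 2)
  set.seed(seed)
  scores <- vapply(ks, function(k) {
    if (k >= nrow(x)) return(NA_real_)          # too few proteins for this k
    cl <- if (k == 1) {
      rep(1L, nrow(x))                          # a single component
    } else {
      kmeans(x, centers = k, nstart = 10)$cluster  # assumption: k-means
    }
    comps <- split(seq_len(nrow(x)), cl)
    comps <- comps[lengths(comps) >= n]         # drop components with < n proteins
    if (length(comps) == 0) return(NA_real_)
    # Mean pairwise Euclidean distance per component, divided by N^p,
    # then averaged over the components that were kept.
    mean(vapply(comps, function(i) {
      mean(dist(x[i, , drop = FALSE])) / length(i)^p
    }, numeric(1)))
  }, numeric(1))
  if (all(is.na(scores))) return(NA_real_)
  min(scores, na.rm = TRUE)                     # best (smallest) score over all k
}

# Toy data: a tight set should score lower than a diffuse one.
set.seed(42)
tight <- matrix(rnorm(40, sd = 0.1), ncol = 2)
loose <- matrix(rnorm(40, sd = 5), ncol = 2)
score_protein_set(tight) < score_protein_set(loose)
```

Dropping small components rather than refitting with k - 1 is a simplification of the note above; the effect on the number of components that contribute to the score is the same.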
Candidates in the matrix are ordered according to lowest mean normalised pairwise Euclidean distance as we expect high density, tight clusters to have the smallest mean normalised distance.
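As a small illustration of the ordering step, the sketch below reorders the columns of a toy annotation matrix by a vector of scores. The term names and score values are made up for the example, and the binary protein-by-term layout is an assumption about how the annotations matrix is stored.

```r
# Toy annotation matrix: 4 proteins (rows) x 3 hypothetical terms (columns).
annot <- matrix(c(1, 1, 0, 0,
                  0, 1, 1, 1,
                  1, 0, 0, 1),
                nrow = 4,
                dimnames = list(paste0("prot", 1:4),
                                c("termA", "termB", "termC")))

# Made-up mean normalised pairwise distances, one per term.
scores <- c(termA = 0.42, termB = 0.11, termC = 0.27)

# Lowest distance (tightest cluster) first.
ordered <- annot[, order(scores), drop = FALSE]
colnames(ordered)
```

The first column of the reordered matrix is then the candidate with the tightest cluster under the chosen normalisation.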
This function is a wrapper for running getNormDist; see the "Annotating spatial proteomics data" vignette for more details.
An MSnSet containing the newly ordered annotations matrix.
Lisa M Breckels
addGoAnnotations and example therein.