Repeated soft clustering for detection of empty clusters for estimation of optimised number of clusters

Description

This function performs repeated soft clustering for a range of cluster numbers c and reports the number of empty clusters detected.

Usage

cselection(eset,m,crange=seq(4,32,4),repeats=5,visu=TRUE,...)

Arguments

eset

object of class ExpressionSet.

m

value of fuzzy c-means parameter m.

crange

range of number of clusters c.

repeats

number of repeated clusterings.

visu

If visu=TRUE, a plot of the number of empty clusters is produced.

...

additional arguments for underlying mfuzz.

Details

A soft cluster is considered empty if none of the genes has a membership value larger than 0.5 for that cluster.
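The emptiness criterion above can be illustrated on a small membership matrix. The following is only a minimal sketch in base R, not the package's implementation: the toy matrix and the helper count_empty_clusters are hypothetical, and the sketch assumes that rows correspond to genes and columns to clusters.

```r
# Toy membership matrix (assumed layout: rows are genes, columns are clusters).
# Each row sums to 1, as fuzzy c-means memberships do.
membership <- matrix(
  c(0.80, 0.15, 0.05,
    0.70, 0.20, 0.10,
    0.10, 0.45, 0.45,
    0.05, 0.35, 0.60),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cluster", 1:3))
)

# Hypothetical helper (not part of Mfuzz): a cluster counts as empty when
# no gene has a membership value larger than the threshold.
count_empty_clusters <- function(u, threshold = 0.5) {
  members_per_cluster <- colSums(u > threshold)
  sum(members_per_cluster == 0)
}

n_empty <- count_empty_clusters(membership)
print(n_empty)
```

In this toy matrix only the second cluster has no membership value above 0.5, so it is the only one counted as empty.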

Value

A matrix with the number of empty clusters detected is generated.

Note

The cselection function may help to determine a suitable number of clusters. However, it should be used with care, as the determination remains difficult, especially for short time series and overlapping clusters. A better approach is likely to perform clustering with a range of cluster numbers and subsequently assess their biological relevance, e.g. by GO analyses.

Author(s)

Matthias E. Futschik (http://www.cbme.ualg.pt/mfutschik_cbme.html)

References

M.E. Futschik and B. Carlisle, Noise robust clustering of gene expression time-course data, Journal of Bioinformatics and Computational Biology, 3(4), 965-988, 2005

L. Kumar and M. Futschik, Mfuzz: a software package for soft clustering of microarray data, Bioinformation, 2(1), 5-7, 2007

Examples

if (interactive()) {
  library(Mfuzz)
  data(yeast)
  # Data pre-processing
  yeastF <- filter.NA(yeast)
  yeastF <- fill.NA(yeastF)
  yeastF <- standardise(yeastF)

  #### Parameter selection
  # Empty clusters should not appear
  cl <- mfuzz(yeastF, c = 20, m = 1.25)
  mfuzz.plot(yeastF, cl = cl, mfrow = c(4, 5))

  # Note: The following calculation might take some time
  tmp <- cselection(yeastF, m = 1.25, crange = seq(5, 40, 5), repeats = 5, visu = TRUE)
  # Deviation of the number of non-empty clusters (crosses) from the diagonal
  # line indicates the appearance of empty clusters

  # Empty clusters might appear
  cl <- mfuzz(yeastF, c = 40, m = 1.25)
  mfuzz.plot(yeastF, cl = cl, mfrow = c(4, 5))
}
