diana2means: 2-Means with Hierarchical Initialization


View source: R/diana2means.R

Description

Split a set of data points into two coherent groups using the k-means algorithm. Instead of random initialization, divisive hierarchical clustering is used to determine initial groups and the corresponding centroids.

Usage

diana2means(mydata, mingroupsize = 5, 
            ngenes = 50, ignore.genes = 5, 
            return.cut = FALSE)

Arguments

mydata

either an expression set as defined by the package Biobase or a matrix of expression levels (rows=genes, columns=samples).

mingroupsize

report only splits where both groups are larger than this size.

ngenes

number of genes used to compute the DLD-score, which measures cluster quality.

ignore.genes

number of best scoring genes to be ignored when computing DLD-scores.

return.cut

logical, whether to return the assignment of samples to groups.

Details

This function uses divisive hierarchical clustering (diana) to generate a first split of the data. Each column of the data matrix is treated as one data element. Centroids are computed from the resulting tentative groups and used to initialize the k-means clustering algorithm.
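The initialization described above can be sketched roughly as follows. This is a minimal illustration, not the adSplit source: it assumes diana() from the cluster package and kmeans() from base R, and the toy matrix m and all variable names are made up for the example.

```r
library(cluster)  # provides diana()

set.seed(1)
# Hypothetical toy data: 20 genes (rows) x 12 samples (columns),
# where the last six samples are shifted upwards.
m <- cbind(matrix(rnorm(20 * 6, mean = 0), nrow = 20),
           matrix(rnorm(20 * 6, mean = 3), nrow = 20))
colnames(m) <- paste0("s", 1:12)

# Columns are the data elements, so cluster the transposed matrix.
samples <- t(m)

# Step 1: divisive hierarchical clustering, cut into two tentative groups.
init <- cutree(as.hclust(diana(samples)), k = 2)

# Step 2: one centroid per tentative group.
centers <- rbind(colMeans(samples[init == 1, , drop = FALSE]),
                 colMeans(samples[init == 2, , drop = FALSE]))

# Step 3: k-means started from these centroids instead of random ones.
km <- kmeans(samples, centers = centers)

# Recode cluster labels 1/2 as 0/1, one value per sample.
assignment <- km$cluster - 1L
```

Because the starting centroids are derived deterministically from the data, repeated runs on the same matrix give the same split, which random initialization does not guarantee.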

The DLD-score of the split optimized by k-means is then computed using the ngenes and ignore.genes arguments.

Value

If return.cut is FALSE (the default), a single number representing the DLD-score of the generated split is returned. Otherwise, an object of class split containing the following elements is returned:

cut

one number, either 0 or 1, for each column of the original data, specifying the group to which that column is assigned.

score

the DLD-score achieved by the split.

Author(s)

Joern Toedling, Claudio Lottaz

See Also

diana

Examples

# load adSplit (provides diana2means) and get the Golub data
library(adSplit)
library(vsn)
library(golubEsets)
data(Golub_Merge)

# use 10% most variable genes
e <- exprs(Golub_Merge)
vars <- apply(e, 1, var)
e <- e[vars > quantile(vars, 0.9), ]

# use diana2means to get splits and scores
diana2means(e)
diana2means(e, return.cut=TRUE)

adSplit documentation built on Nov. 8, 2020, 5:40 p.m.