IClust: IClust (Imbalanced Clustering)

View source: R/IClust.R

Description

IClust discovers clusters of highly different (imbalanced) sizes. First, initial clusters are found using an existing clustering method. Then a merging procedure is applied to merge pairs of close clusters. The merging procedure employs the Local Outlier Factor (Breunig et al., 2000) to assess whether two clusters can be merged.

Usage

IClust(data, method = NULL, k.init = NULL, cv = NULL, q.max = NULL)

Arguments

data

A data matrix of standardized values.

method

An existing clustering method used to create the set of initial clusters, which are then merged. The default is "ward" (Ward's hierarchical clustering); other options are "kmeans", "pam", "complete", and "mclust".

k.init

The number of initial clusters to find with the initial (existing) clustering method. The default is 10*log(nrow(data)).

cv

The type of critical value used for merging two close clusters. The value is determined from the Local Outlier Factor (LOF) scores.

q.max

The maximal number of nearest neighbors used to calculate LOF for the subsequent merging.
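
The initial clustering step described by the method and k.init arguments can be illustrated with base R alone. This is only a sketch on simulated data, not code from the IClust source: the toy data, the variable names, the rounding of k.init with ceiling(), and the choice of "ward.D2" as the Ward variant are all assumptions made for illustration.

```r
# Illustrative sketch only; not taken from the IClust implementation.
set.seed(1)

# Toy data with imbalanced group sizes: 200 vs. 20 observations.
x <- rbind(matrix(rnorm(200 * 2), ncol = 2),
           matrix(rnorm(20 * 2, mean = 6), ncol = 2))
x <- scale(x)  # the data argument is documented as standardized values

# Documented default for k.init; rounding up is an assumption made here.
k_init <- ceiling(10 * log(nrow(x)))

# Over-segment the data with Ward's hierarchical clustering (stats package).
hc <- hclust(dist(x), method = "ward.D2")
init <- cutree(hc, k = k_init)

# Number of initial clusters and their sizes.
length(unique(init))
table(init)
```

Starting from many small initial clusters leaves the later merging step free to form groups of very different sizes.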

Details

The merging procedure incorporating LOF evaluates whether two close clusters share the same local density, i.e., whether they come from the same group. If so, the two clusters are merged. The procedure is repeated until no two clusters can be merged; see References.
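
To make the notion of local density concrete, the following base R sketch computes LOF scores as defined by Breunig et al. (2000). It illustrates LOF only; it does not reproduce the merging rule or the critical value used inside IClust. The function name lof_scores and the toy data are hypothetical, and ties and duplicated points are ignored for simplicity.

```r
# Minimal LOF sketch; lof_scores is a hypothetical helper, not part of IClust.
lof_scores <- function(x, k) {
  d <- as.matrix(dist(x))
  n <- nrow(d)
  diag(d) <- Inf  # a point is not its own neighbor

  # Indices of the k nearest neighbors of each point (ties ignored).
  nn <- matrix(0L, nrow = n, ncol = k)
  for (i in seq_len(n)) nn[i, ] <- order(d[i, ])[seq_len(k)]

  # k-distance: distance from each point to its k-th nearest neighbor.
  kdist <- d[cbind(seq_len(n), nn[, k])]

  # Local reachability density: inverse of the mean reachability distance.
  lrd <- numeric(n)
  for (i in seq_len(n)) {
    nb <- nn[i, ]
    reach <- pmax(kdist[nb], d[i, nb])
    lrd[i] <- 1 / mean(reach)
  }

  # LOF: mean density of the neighbors relative to the point's own density.
  lof <- numeric(n)
  for (i in seq_len(n)) lof[i] <- mean(lrd[nn[i, ]]) / lrd[i]
  lof
}

set.seed(42)
# 50 points from one dense group plus a single distant point (row 51).
x <- rbind(matrix(rnorm(100), ncol = 2), c(10, 10))
scores <- lof_scores(x, k = 5)
which.max(scores)  # index of the point with the lowest relative density
```

Scores close to 1 indicate that a point lies in a region of density similar to that of its neighbors, while clearly larger scores indicate a sparser neighborhood.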

Value

The resulting cluster membership for each observation.

Author(s)

Sarka Brodinova <sarka.brodinova@tuwien.ac.at>

References

S. Brodinova, M. Zaharieva, P. Filzmoser, T. Ortner, C. Breiteneder. (2017). Clustering of imbalanced high-dimensional media data. Advances in Data Analysis and Classification. To appear. Available at http://arxiv.org/abs/1709.10330.

Breunig, M., Kriegel, H., Ng, R., and Sander, J. (2000). LOF: identifying density-based local outliers. In ACM Int. Conf. on Management of Data, pages 93-104.

See Also

ExampleData

Examples

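library(IClust)

# Load the example data set; the code below uses its objects `data` and `label`.
data(ExampleData)

# Cluster with IClust and cross-tabulate against the reference labels.
res <- IClust(data)
table(label, res)

# For comparison: Ward's clustering cut into 4 clusters ("ward" was renamed "ward.D" in R 3.1.0).
hc <- hclust(dist(data), method = "ward.D")
cl <- cutree(hc, k = 4)
table(label, cl)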

brodsa/IClust documentation built on April 7, 2020, 10:41 a.m.