Description Usage Arguments Details References See Also Examples

The FIRES Algorithm follows a three phase framework: In a first phase, base-clusters are generated using a clustering-algorithm on each dimension in isolation. Then these base-clusters are merged in a second phase to find multidimensional cluster-approximations. These approximations are then refined in the third phase.

1 2 3 |

`data` |
A matrix or data frame of input data. |

`base_dbscan_epsilon` |
parameter for the dbscan execution that generates the base clusters |

`base_dbscan_minpts` |
parameter for the dbscan execution that generates the base clusters |

`minimumpercent` |
size a base-cluster must have relative to the average size of base clusters so that it is not discarded |

`k` |
amount of base-clusters that every base-cluster is compared to for merging purposes |

`mu` |
number of most similar clusters in which two clusters must overlap in order to be considered best-merge-clusters of each other |

`minclu` |
number of best-merge-candidates a cluster must have to be considered a best-merge-cluster |

`split` |
a base-cluster is split to merge it with two different clusters iff both clusters resulting from the split have at least this size in proportion to the average size of base-clusters |

`post_dbscan_epsilon` |
parameter for the dbscan execution that turns the cluster-approximations into the clusters that are output at the end |

`post_dbscan_minpts` |
parameter for the dbscan execution that turns the cluster-approximations into the clusters that are output at the end |

In this implementation, the first phase consists of a run of DBSCAN with the
parameters *base_dbscan_epsilon* and *base_dbscan_minpts* on the
objects as they appear from each particular dimension. Then, all of these
base-clusters whose size is smaller than *minimumpercent* of the average
cluster size, e.g. 25
because they are not likely to contain important information for the
clustering.

In the second phase, these base clusters are merged to produce subspace
cluster approximations. This is achieved by computing the
*k*-most-similar clusters for each base-cluster. Then the set of
best-merge-candidates for each base-cluster is determined, which contains
those clusters whose *k*-most-similar clusters overlap the *k*-most
similar clusters of the cluster by at least *mu*. If a cluster has at
least *minclu* best-merge-candidates,it is considered a best-merge
cluster. Finally, every pair of best-merge-clusters that are
best-merge-candidates of each other is grouped together with all of their
best-merge-candidates to form the cluster approximations.

Note that some clusters need to be split and merged with two different other
clusters. This is done before the merging by determining whether the
intersection between a cluster and its most similar cluster as well as the
size of the cluster without this intersection are both larger than
*split* times the average size of the base clusters.

In the third phase, base-clusters are repeatedly removed from
cluster-approximations if their removal improves the amount of objects that
are shared by all base-clusters in the approximation. Finally, to generate the
definitive clusters from the cluster approximation, for each approximation all
base-clusters in the approximation are combined and the a clustering algorithm
is performed on these points. In this implementation, DBSCAN was chosen for
this purpose and will be performed with the parameters
*post_dbscan_epsilon* and *post_dbscan_minpts*.

Hans-Peter Kriegel, Peer Kröger, Matthias Renz and Sebastian Wurst
*A Generic Framework for Efficient Subspace Clustering of
High-Dimensional Data* In Proc. 5th IEEE International Conference on Data
Mining, 2005

Other subspace.clustering.algorithms: `CLIQUE`

;
`P3C`

; `ProClus`

;
`SubClu`

1 2 | ```
data("subspace_dataset")
FIRES(subspace_dataset)
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.