RandomForestClustering: Random Forest Clustering


RandomForestClustering    R Documentation

Random Forest Clustering

Description

Clustering using the proximity matrix of random forest with either PAM or hierarchical clustering algorithms.

Usage

RandomForestClustering(Data, ClusterNo,
                       Type = "ward.D2", NoTrees = 2000,
                       PlotIt = FALSE, PlotForest = FALSE, ...)

Arguments

Data

[1:n,1:d] matrix of the dataset to be clustered. It consists of n cases of d-dimensional data points; every case has d attributes, variables or features.

ClusterNo

A number k which defines k different clusters to be built by the algorithm.

Type

Method of cluster analysis: "PAM", "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median" or "centroid".

NoTrees

Number of trees used in the forest. Default: 2000.

PlotIt

Default: FALSE. If TRUE, plots the first three dimensions of the dataset, with the data points colored according to the clustering stored in Cls.

PlotForest

Default: FALSE. If TRUE, plots the forest.

...

Further arguments to be set for the random forest algorithm. If not set, the default arguments are used.
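The values listed for Type fall into two families: "PAM" (partitioning around medoids) and the hierarchical linkage names. As a rough illustration of that distinction only, the sketch below applies both families to one dissimilarity object. It assumes that the linkage names correspond to the method argument of stats::hclust and that PAM is available through cluster::pam; it uses plain Euclidean distances on toy data as a stand-in for a forest-based dissimilarity. The names X, D, k, ClsPAM, Linkages and ClsByLinkage are hypothetical and not part of this function.

```r
library(cluster)  # recommended package shipped with R; provides pam()

set.seed(1)
# Toy data: two separated Gaussian groups, 20 cases each, 2 dimensions
X <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

# Stand-in dissimilarity (assumption: Euclidean distance, for illustration)
D <- dist(X)
k <- 2

# "PAM" family: partitioning around medoids on the dissimilarities
ClsPAM <- pam(D, k = k, diss = TRUE)$clustering

# Hierarchical family: one clustering per linkage name
Linkages <- c("ward.D", "ward.D2", "single", "complete",
              "average", "mcquitty", "median", "centroid")
ClsByLinkage <- sapply(Linkages, function(m) {
  cutree(hclust(D, method = m), k = k)
})
```

Each column of ClsByLinkage holds one label vector, so the columns can be compared against ClsPAM with table() to see how the choice of method changes the partition.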

Details

Inspired by [Alhusain/Hafez, 2017].
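The general idea of clustering on a random forest proximity matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily how this function is implemented: it assumes the randomForest package is installed, runs it in unsupervised mode (no response given) with proximity = TRUE, and converts proximities to dissimilarities as 1 - proximity, which is a choice made for this sketch. The toy data and the names rf, Dissimilarity and Tree are hypothetical.

```r
library(randomForest)  # assumed to be installed; not part of base R

set.seed(42)
# Toy data: two well-separated Gaussian groups, 50 cases each, 2 dimensions
Data <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
              matrix(rnorm(100, mean = 6), ncol = 2))
colnames(Data) <- c("V1", "V2")

# Unsupervised random forest: with no response supplied, randomForest()
# contrasts the data with a synthetic copy; proximity = TRUE records how
# often two cases fall into the same terminal node
rf <- randomForest(x = Data, ntree = 500, proximity = TRUE)

# Proximities are similarities in [0, 1]; 1 - proximity is used here as
# the dissimilarity (an assumption of this sketch)
Dissimilarity <- as.dist(1 - rf$proximity)

# Hierarchical clustering on the dissimilarities, cut into k = 2 clusters
Tree <- hclust(Dissimilarity, method = "ward.D2")
Cls <- cutree(Tree, k = 2)
```

Here Cls plays the role of the label vector described under Value, and Tree the role of the additional clustering object.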

Value

List of

Cls

[1:n] numerical vector with n numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers representing the arbitrary labels of the clustering.

Object

Object returned by the clustering algorithm, provided as the secondary output of this function.

Author(s)

Michael Thrun

References

[Alhusain/Hafez, 2017] Alhusain, L., & Hafez, A. M.: Cluster ensemble based on Random Forests for genetic data, BioData Mining, Vol. 10(1), p. 37, 2017.

Examples

data('Hepta')
#out=RandomForestClustering(Hepta$Data,ClusterNo=7,PlotIt=FALSE)

Mthrun/FCPS documentation built on June 28, 2023, 9:29 a.m.