View source: R/HDDClustering.R
HDDClustering | R Documentation |
High-dimensional data (HDD) clustering is based on the Gaussian mixture model and on the assumption that the data live in subspaces of lower dimension than the original space. The parameters of the model are estimated with the EM algorithm [Berge et al., 2012].
HDDClustering(Data, ClusterNo, PlotIt=F,...)
Data |
[1:n,1:d] matrix of dataset to be clustered. It consists of n cases of d-dimensional data points. Every case has d attributes, variables or features. |
ClusterNo |
Optional, numeric. Either a single number giving the number of clusters, or a vector 1:k giving the maximal expected number of clusters k. |
PlotIt |
Optional, Boolean. Default: FALSE, i.e. no plotting is performed. |
... |
Further arguments to be set for the clustering algorithm; if not set, default arguments are used, see |
HDD clustering maximises the BIC criterion over a range of possible numbers of clusters up to ClusterNo. By default the most general model is used; alternatively, the parameter model="ALL" can be set to evaluate all possible models with BIC [Berge et al., 2012]. If specific properties of Data are known beforehand, please see hddc for specific model selection.
List of
Cls |
[1:n] numerical vector with n numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers representing the arbitrary labels of the clustering. |
Object |
Object defined by the clustering algorithm, returned as the other output of this algorithm. |
Quirin Stier
[Berge et al., 2012] L. Berge, C. Bouveyron and S. Girard, HDclassif: an R Package for Model-Based Clustering and Discriminant Analysis of High-Dimensional Data, Journal of Statistical Software, vol. 42 (6), pp. 1-29, 2012.
[Bouveyron et al., 2007] C. Bouveyron, S. Girard and C. Schmid, High-Dimensional Data Clustering, Computational Statistics and Data Analysis, vol. 52 (1), pp. 502-519, 2007.
# Hepta
data("Hepta")
Data = Hepta$Data
# The non-default parameter model="ALL"
# can be set to evaluate all possible models
V = HDDClustering(Data=Data,ClusterNo=7,model="ALL")
Cls = V$Cls
ClusterAccuracy(Hepta$Cls, Cls)
## Not run:
library(HDclassif)
data(Crabs)
Data = Crabs[,-1]
V = HDDClustering(Data=Data,ClusterNo=4,com_dim=1)
## End(Not run)
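The Arguments and Details sections describe passing a vector 1:k as ClusterNo so that BIC selects the number of clusters. The following is a minimal sketch of that usage, not taken from the package examples. It assumes that the FCPS package (which provides HDDClustering, Hepta and ClusterAccuracy) and HDclassif are installed; the upper bound of 10 and the variable name SelectedK are illustrative choices only.

```r
# Sketch: let BIC choose the number of clusters from a candidate range.
# Assumption: FCPS and HDclassif are installed.
library(FCPS)
data("Hepta")
Data = Hepta$Data

# A vector 1:k asks the algorithm to consider up to k clusters
# (k = 10 is an arbitrary illustrative upper bound)
V = HDDClustering(Data = Data, ClusterNo = 1:10)
Cls = V$Cls

# Number of clusters actually selected, read off the returned labels
SelectedK = length(unique(Cls))
print(SelectedK)

# Cross-tabulate the known Hepta classes against the labels found
print(table(Hepta$Cls, Cls))
```

Because the labels in Cls are arbitrary, the cross-tabulation is read by checking whether each known class maps mostly to one found label, rather than by comparing label values directly.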