RobMM: RobMM
In RGMM: Robust Mixture Model

View source: R/Final_functions.R

RobMM

Description

Robust Mixture Model

Usage

RobMM(X, nclust=2:5, model="Gaussian", ninit=10,
      nitermax=50, niterEM=50, niterMC=50, df=3,
      epsvp=10^(-4), mc_sample_size=1000, LogLike=-Inf,
      init='genie', epsPi=10^-4, epsout=-20, scale='none',
      alpha=0.75, c=ncol(X), w=2, epsilon=10^(-8),
      criterion='BIC', methodMC="RobbinsMC", par=TRUE,
      methodMCM="Weiszfeld")

Arguments

`X`	A matrix giving the data.
`nclust`	A vector of positive integers giving the possible number of clusters.
`model`	The mixture model. Can be `'Gaussian'` (default), `'Student'` or `'Laplace'`.
`ninit`	The number of random initializations. Default is `10`.
`nitermax`	The number of iterations for the Weiszfeld algorithm if `methodMCM='Weiszfeld'`.
`niterEM`	The number of iterations for the EM algorithm.
`niterMC`	The number of iterations for robustly estimating the variance of each class if `methodMC='FixMC'` or `methodMC='GradMC'`.
`df`	The degrees of freedom for the Student law if `model='Student'`.
`scale`	Run the algorithm on scaled data if `scale='robust'`.
`epsvp`	The minimum value that the estimated eigenvalues of the Median Covariation Matrix can take. Default is `10^-4`.
`mc_sample_size`	The number of data points generated for the Monte Carlo method used to robustly estimate the variance.
`LogLike`	The initial log-likelihood to "beat". Default is `-Inf`.
`init`	Can be `F` if the algorithm uses random initializations only, `'genie'` if it is initialized with the function `genie` of the package `genieclust`, or `'Mclust'` if it is initialized with the function `hclass` of the package `mclust`.
`epsPi`	A scalar ensuring that the estimated probabilities of belonging to a class are uniformly lower bounded by a positive constant.
`epsout`	If the probability that an observation belongs to a class is smaller than `exp(epsout)`, this probability is replaced by `exp(epsout)` when calculating the log-likelihood. If the probability is too small for every class, the observation is considered an outlier. Default is `-20`.
`alpha`	A scalar between 1/2 and 1 used in the step sequence for the Robbins-Monro method if `methodMC='RobbinsMC'`.
`c`	The constant in the step sequence if `methodMC='RobbinsMC'` or `methodMC='GradMC'`.
`w`	The power for the weighted averaged Robbins-Monro algorithm if `methodMC='RobbinsMC'`.
`epsilon`	Stopping condition for the Weiszfeld algorithm.
`criterion`	The criterion for selecting the number of clusters. Can be `'BIC'` (default, as in the usage above) or `'ICL'`.
`methodMC`	The method chosen to estimate robustly the variance. Can be `'RobbinsMC'`, `'GradMC'` or `'FixMC'`.
`par`	Set to `TRUE` (default) to allow parallelization of the algorithm.
`methodMCM`	The method chosen for estimating the Median Covariation Matrix. Can be `'Gmedian'` or `'Weiszfeld'`.
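
The roles of `nitermax` and `epsilon` can be illustrated on the simpler problem of the geometric median (the multivariate L1-median of Vardi and Zhang). The following is a minimal base R sketch of a Weiszfeld iteration, not the code used inside `RobMM`: the function name `weiszfeld_median`, the starting point (the componentwise mean) and the stopping rule (norm of the update below `epsilon`) are assumptions made for this illustration.

```r
# Illustrative Weiszfeld iteration for the geometric median.
# Hypothetical helper, NOT the RGMM implementation.
weiszfeld_median <- function(X, nitermax = 50, epsilon = 1e-8) {
  X <- as.matrix(X)
  m <- colMeans(X)                      # assumed starting point: componentwise mean
  for (iter in seq_len(nitermax)) {
    diffs <- sweep(X, 2, m)             # rows X[i, ] - m
    d <- sqrt(rowSums(diffs^2))         # Euclidean distances to the current estimate
    d <- pmax(d, .Machine$double.eps)   # guard against division by zero
    w <- 1 / d                          # inverse-distance weights
    m_new <- colSums(X * w) / sum(w)    # weighted average of the observations
    step <- sqrt(sum((m_new - m)^2))
    m <- m_new
    if (step < epsilon) break           # assumed stopping rule based on `epsilon`
  }
  m
}

# Small demonstration: 100 Gaussian points plus one gross outlier.
set.seed(1)
X_demo <- rbind(matrix(rnorm(200), ncol = 2), c(100, 100))
med <- weiszfeld_median(X_demo)
```

Comparing `med` with `colMeans(X_demo)` shows how little the single outlier moves the median compared with the mean, which is the motivation for median-based estimates in a robust mixture model.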

Value

A list with:

`bestresult`	A list giving all the results for the best clustering (chosen with respect to the selected criterion).
`allresults`	A list containing all the results.
`ICL`	The ICL criterion for all the number of classes selected.
`BIC`	The BIC criterion for all the number of classes selected.
`data`	The initial data.
`nclust`	A vector of positive integers giving the possible number of clusters.
`Kopt`	The number of clusters chosen by the selected criterion.

For the lists bestresult and allresults[[k]]:

`centers`	A matrix whose rows are the centers of the classes.
`Sigma`	A matrix containing the variances of all the classes.
`LogLike`	The final log-likelihood.
`Pi`	A matrix giving the probability that each observation belongs to each class.
`niter`	The number of iterations of the EM algorithm.
`initEM`	A vector giving the initialized clustering if `init='Mclust'` or `init='genie'`.
`prop`	A vector giving the proportion of each class.
`outliers`	A vector giving the detected outliers.
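
A membership matrix such as `Pi` is often turned into a hard clustering by assigning each observation to its most probable class. The base R sketch below uses a small made-up matrix `Pi_demo` as a stand-in; it assumes that rows index observations and columns index classes, which should be checked against the actual output of `RobMM`.

```r
# Hypothetical 4 x 3 membership matrix standing in for `Pi`
# (assumption: rows = observations, columns = classes).
Pi_demo <- matrix(c(0.90, 0.05, 0.05,
                    0.10, 0.80, 0.10,
                    0.20, 0.30, 0.50,
                    0.34, 0.33, 0.33),
                  ncol = 3, byrow = TRUE)

# Index of the most probable class for each observation
labels <- max.col(Pi_demo, ties.method = "first")

# Probability attached to the chosen class; low values flag uncertain assignments
max_prob <- apply(Pi_demo, 1, max)
```

With a fitted model, the same two lines could be applied to `res$bestresult$Pi` under the orientation assumption stated above.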

References

Cardot, H., Cenac, P. and Zitt, P-A. (2013). Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli, 19, 18-43.

Cardot, H. and Godichon-Baggioni, A. (2017). Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis. Test, 26(3), 461-480.

Vardi, Y. and Zhang, C.-H. (2000). The multivariate L1-median and associated data depth. Proc. Natl. Acad. Sci. USA, 97(4), 1423-1426.

Examples

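## Not run: 
library(RGMM)
# Simulate a three-component mixture with centers (-2,-2,-2), (2,2,2) and (0,0,0)
ech <- Gen_MM(mu = matrix(c(rep(-2, 3), rep(2, 3), rep(0, 3)), byrow = TRUE, nrow = 3))
X <- ech$X
# Fit the robust mixture model with three clusters and plot the result
res <- RobMM(X, nclust = 3)
RMMplot(res, graph = c('Two_Dim'))

## End(Not run)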

RGMM documentation built on Nov. 24, 2023, 5:10 p.m.