fitmixture | R Documentation |
Estimates the parameters of a mixture model using the expectation-maximization (EM) algorithm. The general form of the cdf of a statistical mixture model is
F(x,\Theta) = \sum_{j=1}^{K}\omega_j F_j(x,\theta_j),
where \Theta=(\theta_1,\dots,\theta_K)^T is the whole parameter vector, \theta_j=(\alpha_j,\beta_j)^{T} for j=1,\dots,K is the parameter vector of the j-th component, F_j(.,\theta_j) is the cdf of the j-th component, and the known constant K is the number of components. The parameters \alpha and \beta are either the shape and scale parameters, or both are shape parameters; in the latter case, \alpha and \beta are called the first and second shape parameters, respectively. The weights \omega_j sum to one, i.e. \sum_{j=1}^{K}\omega_j=1. The families considered for the cdf F include Birnbaum-Saunders, Burr type XII, Chen, F, Frechet, Gamma, Gompertz, Log-normal, Log-logistic, Lomax, skew-normal, and Weibull.
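The mixture cdf above can be evaluated directly once weights and component parameters are given. The following is a minimal sketch for Weibull components using base R only; `mixture_cdf` is a hypothetical helper written for illustration (it is not part of the package), and the parameter values are made up rather than estimated from data.

```r
# Sketch: evaluate F(x, Theta) = sum_j omega_j * F_j(x, theta_j) for Weibull
# components. mixture_cdf is a hypothetical helper, not a package function.
mixture_cdf <- function(x, omega, alpha, beta) {
  stopifnot(length(omega) == length(alpha),
            length(alpha) == length(beta),
            abs(sum(omega) - 1) < 1e-8)   # weights must sum to one
  # One column per component: omega_j * F_j(x; alpha_j, beta_j)
  parts <- sapply(seq_along(omega), function(j) {
    omega[j] * pweibull(x, shape = alpha[j], scale = beta[j])
  })
  # Sum over components; matrix() keeps the shape right when x has length one
  rowSums(matrix(parts, nrow = length(x)))
}

# Illustrative values only (not estimates from any data set)
omega <- c(0.5, 0.3, 0.2)   # mixing weights
alpha <- c(2, 5, 8)         # shape parameters
beta  <- c(5, 10, 20)       # scale parameters
mixture_cdf(c(0, 5, 10, 1e6), omega, alpha, beta)
```

The result is non-decreasing in x, starts at zero, and approaches one, as expected of a cdf built from a convex combination of cdfs.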
fitmixture(data, family, K, initial=FALSE, starts)
data |
Vector of observations. |
family |
Name of the family including: " |
K |
Number of components. |
initial |
The sequence of initial values including |
starts |
If |
Note that the mixture model is assumed to be identifiable. For the skew-normal case, \theta_j=(\alpha_j,\beta_j,\lambda_j)^{T}, in which -\infty<\alpha_j<\infty, \beta_j>0, and -\infty<\lambda_j<\infty are the location, scale, and skewness parameters of the j-th component, respectively; see Azzalini (1985).
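For reference, the skew-normal density of Azzalini (1985) with this location, scale, and skewness parameterization can be sketched in base R as follows. The name `dsn_sketch` is hypothetical and is used only for illustration; how the package evaluates this density internally is not stated here.

```r
# Sketch of the skew-normal density with location alpha, scale beta > 0,
# and skewness lambda. dsn_sketch is a hypothetical name for illustration.
dsn_sketch <- function(x, alpha, beta, lambda) {
  stopifnot(beta > 0)
  z <- (x - alpha) / beta                     # standardized observation
  (2 / beta) * dnorm(z) * pnorm(lambda * z)   # 2/beta * phi(z) * Phi(lambda * z)
}

# lambda = 0 recovers the normal density with mean alpha and sd beta
all.equal(dsn_sketch(1.3, 0, 2, 0), dnorm(1.3, mean = 0, sd = 2))

# The density integrates to one for any skewness value
integrate(dsn_sketch, -Inf, Inf, alpha = 1, beta = 2, lambda = 3)$value
```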
The output has three parts. The first part includes the vector of estimated weight, shape, and scale parameters.
The second part is a sequence of goodness-of-fit measures consisting of the Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC), Bayesian Information Criterion (BIC), Hannan-Quinn Information Criterion (HQIC), Anderson-Darling (AD), Cramer-von Mises (CVM), Kolmogorov-Smirnov (KS), and log-likelihood (log-likelihood) statistics.
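As a rough guide to how the likelihood-based criteria relate to the log-likelihood, the sketch below uses the textbook definitions of AIC, BIC, and HQIC. It is an assumption that these match the formulas used internally; CAIC is left out because its definition varies between sources. The helper `info_criteria` and the numbers passed to it are hypothetical.

```r
# Textbook definitions; the formulas used internally by fitmixture may differ.
# info_criteria is a hypothetical helper, not a package function.
info_criteria <- function(loglik, n, p) {
  c(AIC  = -2 * loglik + 2 * p,
    BIC  = -2 * loglik + p * log(n),
    HQIC = -2 * loglik + 2 * p * log(log(n)))
}

K <- 3
p <- 3 * K - 1   # (K - 1) free weights plus 2 parameters per component

# loglik and n are made-up numbers for illustration
info_criteria(loglik = -250, n = 100, p = p)
```

For a skew-normal mixture each component carries three parameters, so the count under the same reasoning would be 4K - 1.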
The last part of the output contains the clustering vector.
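One common way to obtain a clustering vector from a fitted mixture is to assign each observation to the component with the largest posterior membership probability, the quantity computed in the E-step of the EM algorithm. The sketch below assumes that rule for Weibull components; whether fitmixture uses exactly this rule is not stated here, and `cluster_by_posterior` and all parameter values are hypothetical.

```r
# Sketch: posterior membership probabilities for a Weibull mixture and a
# clustering vector taken as the most probable component.
# cluster_by_posterior is a hypothetical helper, not a package function.
cluster_by_posterior <- function(x, omega, alpha, beta) {
  # n x K matrix with entries omega_j * f_j(x_i; alpha_j, beta_j)
  dens <- sapply(seq_along(omega), function(j) {
    omega[j] * dweibull(x, shape = alpha[j], scale = beta[j])
  })
  dens <- matrix(dens, nrow = length(x))
  post <- dens / rowSums(dens)   # normalize so each row sums to one
  list(posterior = post,
       cluster   = max.col(post, ties.method = "first"))
}

# Illustrative observations and parameters (not fitted values)
x <- c(1, 4, 9, 18)
res <- cluster_by_posterior(x, omega = c(0.5, 0.5),
                            alpha = c(2, 6), beta = c(3, 15))
res$cluster
```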
Mahdi Teimouri
A. Azzalini, 1985. A class of distributions which includes the normal ones, Scandinavian Journal of Statistics, 12, 171-178.
A. P. Dempster, N. M. Laird, and D. B. Rubin, 1977. Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B, 39, 1-38.
M. Teimouri, S. Rezakhah, and A. Mohammadpour, 2018. EM algorithm for symmetric stable mixture model, Communications in Statistics-Simulation and Computation, 47(2), 582-604.
# Here we model the northern hardwood uneven-age forest data (HW$DIA) in inches using a
# 3-component Weibull mixture distribution.
data(HW)
data<-HW$DIA
K<-3
fitmixture(data,"weibull", K, initial=FALSE)