fitmixture | R Documentation |
Estimates the parameters of a mixture model using the expectation-maximization (EM) algorithm. The general form of the cdf of a statistical mixture model is given by
F(x,Θ) = ∑_{j=1}^{K} ω_j F_j(x,θ_j),
where Θ=(θ_1,…,θ_K)^T is the whole parameter vector, θ_j=(α_j,β_j)^{T} for j=1,…,K is the parameter vector of the j-th component, F_j(.,θ_j) is the cdf of the j-th component, and the known constant K is the number of components. The parameters α_j and β_j are either the shape and scale parameters, or both shape parameters. In the latter case, α_j and β_j are called the first and second shape parameters, respectively. The weights ω_j sum to one, i.e. ∑_{j=1}^{K}ω_j=1. The families considered for the cdf F_j include Birnbaum-Saunders, Burr type XII, Chen, F, Frechet, Gamma, Gompertz, Log-normal, Log-logistic, Lomax, skew-normal, and Weibull.
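To make the mixture cdf above concrete, the following is a minimal base-R sketch for Weibull components. The helper name mixture_cdf and the parameter values are illustrative assumptions and are not part of the package.

```r
# Mixture cdf F(x, Theta) = sum_j omega_j * F_j(x, theta_j) with Weibull components.
# mixture_cdf is a hypothetical helper, not a function of the package.
mixture_cdf <- function(x, omega, alpha, beta) {
  stopifnot(length(omega) == length(alpha),
            length(omega) == length(beta),
            abs(sum(omega) - 1) < 1e-8)
  # One column per component: omega_j * F_j(x; alpha_j, beta_j)
  parts <- sapply(seq_along(omega), function(j) {
    omega[j] * pweibull(x, shape = alpha[j], scale = beta[j])
  })
  # Sum across components; matrix() keeps this working when x has length 1
  rowSums(matrix(parts, nrow = length(x)))
}

# Made-up parameter values for K = 3 components
omega <- c(0.5, 0.3, 0.2)   # weights, sum to one
alpha <- c(1.5, 3.0, 6.0)   # shape parameters
beta  <- c(5, 10, 20)       # scale parameters
mixture_cdf(c(0, 5, 10, 50), omega, alpha, beta)
```

Other families would only change the component cdf (pweibull here).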
fitmixture(data, family, K, initial=FALSE, starts)
data |
Vector of observations. |
family |
Name of the family including: " |
K |
Number of components. |
initial |
The sequence of initial values, given in the order ω_1,…,ω_K,α_1,…,α_K,β_1,…,β_K. For the skew-normal case, the vector of initial values of the skewness parameters is appended. By default, the initial values are determined automatically by the k-means method of clustering. |
starts |
If |
Note that the mixture model is assumed to be identifiable. For the skew-normal case we have θ_j=(α_j,β_j,λ_j)^{T}, in which -∞<α_j<∞, β_j>0, and -∞<λ_j<∞ are, respectively, the location, scale, and skewness parameters of the j-th component; see Azzalini (1985).
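For reference, the skew-normal density of Azzalini (1985) with location α, scale β, and skewness λ can be written as f(x) = (2/β) φ((x-α)/β) Φ(λ(x-α)/β), where φ and Φ are the standard normal pdf and cdf. The base-R sketch below evaluates it; the function name dskewnorm is a hypothetical helper, and the package's internal implementation may differ.

```r
# Skew-normal density in the (location, scale, skewness) parameterisation.
# dskewnorm is a hypothetical helper name, not a function of the package.
dskewnorm <- function(x, alpha, beta, lambda) {
  stopifnot(beta > 0)
  z <- (x - alpha) / beta
  (2 / beta) * dnorm(z) * pnorm(lambda * z)
}

# With lambda = 0 the density reduces to the normal density
all.equal(dskewnorm(1.3, alpha = 0, beta = 2, lambda = 0),
          dnorm(1.3, mean = 0, sd = 2))

# Numerical check that the density integrates to one
integrate(dskewnorm, -Inf, Inf, alpha = 1, beta = 2, lambda = 3)$value
```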
The output has three parts. The first part contains the vector of estimated weight, shape, and scale parameters.
The second part is a sequence of goodness-of-fit measures consisting of the Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC), Bayesian Information Criterion (BIC), Hannan-Quinn Information Criterion (HQIC), Anderson-Darling (AD), Cramér-von Mises (CVM), Kolmogorov-Smirnov (KS), and log-likelihood (log-likelihood) statistics.
The last part of the output contains the clustering vector.
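As a rough illustration of how some of these quantities relate to the fitted parameters, the sketch below computes the log-likelihood, the textbook forms of AIC, BIC, and HQIC, and a clustering vector for a Weibull mixture. It rests on several assumptions: the helper name summarise_weibull_mixture is hypothetical, the parameter count p = 3K - 1 treats K - 1 weights as free, and each observation is assigned to the component with the largest posterior weight. The package may count parameters or define the criteria differently. CAIC, AD, CVM, and KS are left out.

```r
# Hypothetical helper: log-likelihood, information criteria, and clustering
# vector for a Weibull mixture with given (not estimated here) parameters.
summarise_weibull_mixture <- function(x, omega, alpha, beta) {
  K <- length(omega)
  n <- length(x)
  # n x K matrix with entries omega_j * f_j(x_i; alpha_j, beta_j)
  dens <- sapply(seq_len(K), function(j) {
    omega[j] * dweibull(x, shape = alpha[j], scale = beta[j])
  })
  dens <- matrix(dens, nrow = n)
  loglik <- sum(log(rowSums(dens)))
  p <- 3 * K - 1  # assumption: K shapes + K scales + (K - 1) free weights
  list(
    loglik = loglik,
    AIC    = -2 * loglik + 2 * p,
    BIC    = -2 * loglik + p * log(n),
    HQIC   = -2 * loglik + 2 * p * log(log(n)),
    # Component with the largest posterior weight for each observation
    cluster = max.col(dens / rowSums(dens), ties.method = "first")
  )
}

# Simulated data from a two-component Weibull mixture (made-up parameters)
set.seed(1)
x <- c(rweibull(150, shape = 2, scale = 3), rweibull(50, shape = 8, scale = 12))
out <- summarise_weibull_mixture(x, omega = c(0.75, 0.25),
                                 alpha = c(2, 8), beta = c(3, 12))
out$AIC
table(out$cluster)
```

The posterior weights used for the clustering vector are the same quantities an EM algorithm computes in its E-step.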
Mahdi Teimouri
A. Azzalini, 1985. A class of distributions which includes the normal ones, Scandinavian Journal of Statistics, 12, 171-178.
A. P. Dempster, N. M. Laird, and D. B. Rubin, 1977. Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B, 39, 1-38.
M. Teimouri, S. Rezakhah, and A. Mohammadpour, 2018. EM algorithm for symmetric stable mixture model, Communications in Statistics-Simulation and Computation, 47(2), 582-604.
# Here we model the northern hardwood uneven-age forest data (HW$DIA), in inches,
# using a 3-component Weibull mixture distribution.
library(ForestFit)  # assumption: the package that provides fitmixture() and HW
data(HW)
data <- HW$DIA
K <- 3
fitmixture(data, "weibull", K, initial=FALSE)