fitmixturegrouped: Estimating parameters of the well-known mixture models fitted...
In ForestFit: Statistical Modelling for Plant Size Distributions

fitmixturegrouped

R Documentation

Estimating parameters of the well-known mixture models fitted to the grouped data

Description

Estimates the parameters of the gamma, log-normal, and Weibull mixture models fitted to grouped data using the expectation-maximization (EM) algorithm. The general form of the cdf of a statistical mixture model is given by

F(x,{\Theta}) = \sum_{k=1}^{K}\omega_k F_k(x,\theta_k),

where \Theta=(\theta_1,\dots,\theta_K)^T is the whole parameter vector, \theta_k for k=1,\dots,K is the parameter vector of the k-th component, i.e. \theta_k=(\alpha_k,\beta_k)^{T}, F_k(.,\theta_k) is the cdf of the k-th component, and the known constant K is the number of components. The parameters \alpha_k and \beta_k are the shape and scale parameters, respectively. The weights \omega_k sum to one, i.e. \sum_{k=1}^{K}\omega_k=1. The families considered for the cdf F include the gamma, log-normal, and Weibull. Suppose that a sample of n independent observations, each following a distribution with cdf F, has been divided into m separate groups of the form (r_{i-1},r_i], for i=1,\dots,m, and let f_i denote the frequency of the i-th group. Then the likelihood function of the observed data is given by

L(\Theta|f_1,\dots,f_m)=\frac{n!}{f_{1}!f_{2}!\dots f_{m}!}\prod_{i=1}^{m}\Bigl[\frac{F_i(\Theta)}{F(\Theta)}\Bigr]^{f_i},

where

F_i(\Theta)=\sum_{k=1}^{K}\omega_k\int_{r_{i-1}}^{r_i}f(x|\theta_k)dx,

F(\Theta)=\sum_{k=1}^{K}\omega_k\int_{r_0}^{r_m}f(x|\theta_k)dx,

in which f(x|\theta_k) denotes the pdf of the k-th component. Using the EM algorithm proposed by Dempster et al. (1977), we can solve \partial L(\Theta|f_1,\dots,f_m)/{\partial \Theta}=0 by introducing two new missing variables.
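
The grouped-data likelihood above can be evaluated directly in base R. The following is a minimal sketch for the Weibull case, not the package's internal code. The helper names `mixture_cdf` and `grouped_loglik` and the example class limits and counts are hypothetical, and F(\Theta) is assumed to be the mixture probability of the whole range (r_0,r_m].

```r
# Hypothetical helper: cdf of a K-component Weibull mixture,
# F(x, Theta) = sum_k omega_k * F_k(x, theta_k).
mixture_cdf <- function(x, omega, alpha, beta) {
  out <- numeric(length(x))
  for (k in seq_along(omega)) {
    out <- out + omega[k] * pweibull(x, shape = alpha[k], scale = beta[k])
  }
  out
}

# Hypothetical helper: log-likelihood of the group frequencies f
# for class limits r (length(r) == length(f) + 1).
grouped_loglik <- function(r, f, omega, alpha, beta) {
  Fr   <- mixture_cdf(r, omega, alpha, beta)
  Fi   <- diff(Fr)                  # mass of each group (r_{i-1}, r_i]
  Ftot <- Fr[length(r)] - Fr[1]     # assumed F(Theta): mass of (r_0, r_m]
  n    <- sum(f)
  # log of n!/(f_1! ... f_m!) plus sum_i f_i * log(F_i / F)
  lgamma(n + 1) - sum(lgamma(f + 1)) + sum(f * log(Fi / Ftot))
}

# Invented class limits and counts, for illustration only.
r <- c(0, 0.5, 1, 1.5, 2, 3, 5)
f <- c(8, 12, 10, 9, 7, 4)
ll <- grouped_loglik(r, f, omega = c(0.3, 0.7), alpha = c(1, 2), beta = c(2, 1))
ll
```

Because the expression is the logarithm of a multinomial probability, the returned value is never positive. An empty group whose probability is numerically zero would need separate handling, which this sketch omits.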

Usage

fitmixturegrouped(family, r, f, K, initial=FALSE, starts)

Arguments

`family`	Name of the family; one of "`gamma`", "`log-normal`", "`skew-normal`", and "`weibull`".
`r`	A numeric vector of length `m+1`. The first element of `r` is the lower bound of the first group and the other `m` elements are the upper bounds of the `m` groups. Note that the upper bound of the `(i-1)`-th group is the lower bound of the `i`-th group, for `i=2,\dots,m`. The lower bound of the first group and the upper bound of the `m`-th group are chosen arbitrarily. If raw data are available, the smallest and largest observations are used as the lower bound of the first group and the upper bound of the `m`-th group, respectively.
`f`	A numeric vector of length `m` containing the group frequencies.
`K`	Number of components.
`initial`	Logical. If `initial=TRUE`, the initial values given in `starts` are used. By default (`initial=FALSE`), the initial values are determined automatically by the k-means method of clustering.
`starts`	If `initial=TRUE`, the sequence of initial values must be given, in the order `\omega_1,\dots,\omega_K,\alpha_1,\dots,\alpha_K,\beta_1,\dots,\beta_K`. For the skew-normal case, the vector of initial values of the skewness parameters is appended.
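
As an illustration of the argument layout, the sketch below builds `r` and `f` for a hypothetical table of m = 5 size classes and assembles a vector of starting values in the order weights, shapes, scales. The class limits, counts, and starting values are invented for illustration, and the call to `fitmixturegrouped` is guarded so that it runs only when ForestFit is installed.

```r
# Hypothetical grouped data: 5 classes with limits 5, 15, 25, 35, 45, 55.
r <- c(5, 15, 25, 35, 45, 55)   # length m + 1: lower bound, then m upper bounds
f <- c(12, 30, 25, 9, 4)        # length m: frequency of each class
K <- 2

# r must be strictly increasing and one element longer than f.
stopifnot(length(r) == length(f) + 1, all(diff(r) > 0))

# Invented starting values in the order: weights, shapes, scales.
omega0 <- c(0.6, 0.4)
alpha0 <- c(2, 5)
beta0  <- c(15, 30)
starts <- c(omega0, alpha0, beta0)
stopifnot(length(starts) == 3 * K, abs(sum(omega0) - 1) < 1e-8)

# Run the fit only if the package is available.
if (requireNamespace("ForestFit", quietly = TRUE)) {
  fit <- ForestFit::fitmixturegrouped("weibull", r, f, K,
                                      initial = TRUE, starts = starts)
  print(fit)
}
```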

Details

Identifiability of the mixture models is assumed to hold. For the skew-normal mixture model, the parameter vector of the k-th component takes the form \theta_k=(\alpha_k,\beta_k,\lambda_k)^{T}, where \alpha_k, \beta_k, and \lambda_k denote the location, scale, and skewness parameters, respectively.
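
For reference, and assuming the package follows the usual Azzalini parameterization of the skew-normal distribution (an assumption, not confirmed by this page), the pdf of the k-th component would read

f(x|\theta_k)=\frac{2}{\beta_k}\phi\Bigl(\frac{x-\alpha_k}{\beta_k}\Bigr)\Phi\Bigl(\lambda_k\frac{x-\alpha_k}{\beta_k}\Bigr),

where \phi and \Phi denote the pdf and cdf of the standard normal distribution. Setting \lambda_k=0 recovers the normal pdf with mean \alpha_k and standard deviation \beta_k.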

Value

The output has two parts. The first part includes the vector of estimated weight, shape, and scale parameters.
The second part is a sequence of goodness-of-fit measures consisting of the Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC), Bayesian Information Criterion (BIC), Hannan-Quinn information criterion (HQIC), Anderson-Darling (AD), Cramer-von Mises (CVM), Kolmogorov-Smirnov (KS), and log-likelihood (log-likelihood) statistics.
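
For orientation, the likelihood-based criteria follow from the maximized log-likelihood by their standard textbook formulas. The sketch below is a minimal illustration under assumed inputs, not the package's code: `loglik`, `n`, and `K` are hypothetical values, and the count of 3K - 1 free parameters assumes a two-parameter family with weights constrained to sum to one. CAIC is omitted because its definition varies between sources.

```r
loglik <- -123.4   # hypothetical maximized log-likelihood
n <- 50            # total number of observations, i.e. sum(f)
K <- 2             # number of components

# Assumed free parameters: K shapes + K scales + (K - 1) free weights.
p <- 3 * K - 1

aic  <- -2 * loglik + 2 * p                 # Akaike
bic  <- -2 * loglik + p * log(n)            # Bayesian (Schwarz)
hqic <- -2 * loglik + 2 * p * log(log(n))   # Hannan-Quinn

round(c(AIC = aic, BIC = bic, HQIC = hqic), 2)
```

Smaller values indicate a better trade-off between fit and complexity, so these criteria are typically compared across candidate families or across values of `K`.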

Author(s)

Mahdi Teimouri

References

G. J. McLachlan and P. N. Jones, 1988. Fitting mixture models to grouped and truncated data via the EM algorithm, Biometrics, 44, 571-578.

Examples

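library(ForestFit)
# Simulate n observations from a two-component Weibull mixture.
n<-50
K<-2
m<-10
weight<-c(0.3,0.7)
alpha<-c(1,2)
beta<-c(2,1)
param<-c(weight,alpha,beta)
data<-rmixture(n, "weibull", K, param)
# Group the observations into m classes spanning the observed range.
r<-seq(min(data),max(data),length.out=m+1)
D<-data.frame(table(cut(data,r,labels=NULL,include.lowest=TRUE,right=FALSE,dig.lab=4)))
f<-D$Freq
# Fit the mixture to the grouped data; initial values are chosen automatically.
fitmixturegrouped("weibull",r,f,K,initial=FALSE)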

ForestFit documentation built on April 3, 2025, 5:27 p.m.