
Functions `cnmms`, `cnmpl` and `cnmap` can be used to
compute the maximum likelihood estimate of a semiparametric mixture
model that has a one-dimensional mixing parameter. The types of
mixture models that can be computed include finite, nonparametric and
semiparametric ones.

Function `cnmms` can also be used to compute the maximum
likelihood estimate of a finite or nonparametric mixture model.

```
cnmms(x, init=NULL, maxit=1000, model=c("spmle","npmle"), tol=1e-6,
      grid=100, kmax=Inf, plot=c("null", "gradient", "probability"),
      verbose=0)
cnmpl(x, init=NULL, tol=1e-6, tol.npmle=tol*1e-4, grid=100, maxit=1000,
      plot=c("null", "gradient", "probability"), verbose=0)
cnmap(x, init=NULL, maxit=1000, tol=1e-6, grid=100, plot=c("null",
      "gradient"), verbose=0)
```

| Argument | Description |
|---|---|
| `x` | a data object of some class that can be defined fully by the user |
| `init` | list of user-provided initial values for the mixing distribution |
| `model` | the type of model that is to be estimated: non-parametric MLE ( |
| `maxit` | maximum number of iterations |
| `tol` | a tolerance value that is used to terminate an algorithm. Specifically, the algorithm is terminated if the relative increase of the log-likelihood value after an iteration is less than |
| `tol.npmle` | a tolerance value that is used internally to terminate the computation of the NPMLE. |
| `grid` | number of grid points that are used by the algorithm to locate all the local maxima of the gradient function. A larger number increases the chance of locating all local maxima, at the expense of an increased computational cost. The locations of the grid points are determined by the function |
| `kmax` | upper bound on the number of support points. This is useful for fitting a finite mixture model. |
| `plot` | whether a plot is produced at each iteration. Useful for monitoring the convergence of the algorithm. If |
| `verbose` | verbosity level for printing intermediate results in each iteration, including none (= 0), the log-likelihood value (= 1), the maximum gradient (= 2), the support points of the mixing distribution (= 3), the mixing proportions (= 4), and if available, the value of the structural parameter beta (= 5). |
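
The stopping rule described for `tol` can be sketched as a small helper. This is only an illustration: the function name `rel_increase_below_tol` is hypothetical, and scaling the increase by the absolute value of the new log-likelihood is an assumption, since the exact normalization used inside the package is not stated here.

```r
# Hypothetical helper: TRUE when the relative increase of the
# log-likelihood between two iterations falls below 'tol'.
# Assumption: the increase is scaled by |ll_new|.
rel_increase_below_tol <- function(ll_old, ll_new, tol = 1e-6) {
  (ll_new - ll_old) < tol * abs(ll_new)
}

rel_increase_below_tol(-100, -100 + 1e-8)  # tiny gain: stop iterating
rel_increase_below_tol(-100, -99)          # large gain: keep iterating
```

A relative rather than absolute criterion keeps the meaning of `tol` roughly the same whether the log-likelihood is of order 10 or of order 10,000.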

A finite mixture model has a density of the form

$$f(x; \pi, \theta, \beta) = \sum_{j=1}^k \pi_j f(x; \theta_j, \beta),$$

where $\pi_j \ge 0$ and $\sum_{j=1}^k \pi_j = 1$.

A nonparametric mixture model has a density of the form

$$f(x; G) = \int f(x; \theta) \, dG(\theta),$$

where $G$ is a mixing distribution that is completely
unspecified. The maximum likelihood estimate of the
nonparametric $G$, or the NPMLE of $G$, is known to be a
discrete distribution function.

A semiparametric mixture model has a density of the form

$$f(x; G, \beta) = \int f(x; \theta, \beta) \, dG(\theta),$$

where $G$ is a mixing distribution that is completely unspecified and
$\beta$ is the structural parameter.
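
The gradient function referred to by the `grid` argument and the `max.gradient` value is not written out on this page. As a sketch in the standard notation for mixture likelihoods, the log-likelihood and the gradient function for the semiparametric model can be written as follows. The symbols `n`, `\ell` and `d` are introduced here for illustration, observations are assumed to have unit weights, and the package may scale these quantities differently.

```latex
\ell(G, \beta) = \sum_{i=1}^{n} \log f(x_i; G, \beta),
\qquad
d(\theta; G, \beta) = \sum_{i=1}^{n}
  \frac{f(x_i; \theta, \beta)}{f(x_i; G, \beta)} - n .
```

For a fixed structural parameter, a mixing distribution maximizes the likelihood exactly when the gradient function is nonpositive for every value of the mixing parameter, which is why locating the local maxima of the gradient yields candidate support points.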

Of the three functions, `cnmms` is recommended for most problems;
see Wang (2010).

Functions `cnmms`, `cnmpl` and `cnmap` implement the
algorithms CNM-MS, CNM-PL and CNM-AP that are described in Wang
(2010). Their implementations are generic using S3 object-oriented
programming, in the sense that they can work for an arbitrary family
of mixture models that is defined by the user. The user, however,
needs to supply the implementations of the following functions for
their self-defined family of mixture models, as they are needed
internally by the functions above:

`initial(x, beta, mix, kmax)`

`valid(x, beta)`

`logd(x, beta, pt, which)`

`gridpoints(x, beta, grid)`

`suppspace(x, beta)`

`length(x)`

`print(x, ...)`

`weights(x, ...)`

While not needed by the algorithms, one may also implement

`plot(x, mix, beta, ...)`

so that the fitted model can be shown graphically in a way that the user desires.
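
As a rough illustration of what a user-defined family might look like, the sketch below defines S3 methods for a hypothetical class `toynorm` (normal components with unit variance and no structural parameter). The class name, its fields, and all return shapes are assumptions made for illustration; the contract actually expected by `cnmms` should be checked against the families shipped with the package. `initial` and `plot` are omitted, the former because it must return a mixing-distribution object whose class is not described here.

```r
# Hypothetical family "toynorm": constructor stores values and weights.
toynorm <- function(v, w = 1) {
  structure(list(v = v, w = rep(w, length.out = length(v))),
            class = "toynorm")
}

# Number of observations
length.toynorm <- function(x) length(x$v)

# Observation weights
weights.toynorm <- function(x, ...) x$w

print.toynorm <- function(x, ...) {
  cat("toynorm data with", length(x), "observations\n")
  invisible(x)
}

# Assumed: TRUE when beta is admissible (this family has no beta)
valid.toynorm <- function(x, beta) TRUE

# Assumed: lower and upper limits of the space of the mixing parameter
suppspace.toynorm <- function(x, beta) range(x$v)

# Assumed: 'grid' equally spaced candidate support points
gridpoints.toynorm <- function(x, beta, grid) {
  r <- suppspace.toynorm(x, beta)
  seq(r[1], r[2], length.out = grid)
}

# Assumed: a list whose element 'ld' is a matrix of log-densities,
# one row per observation and one column per support point in 'pt'.
# The 'which' argument is accepted but ignored in this sketch.
logd.toynorm <- function(x, beta, pt, which = c(1, 0, 0)) {
  list(ld = outer(x$v, pt,
                  function(a, b) dnorm(a, mean = b, sd = 1, log = TRUE)))
}

x <- toynorm(c(-1, 0, 2))
g <- gridpoints.toynorm(x, NULL, 5)
dim(logd.toynorm(x, NULL, g)$ld)
```

The family-specific methods are called directly here so that the snippet runs without the package loaded; with the package attached, its generics would dispatch on the class of `x` instead.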

For creating a new class, the user may consult the implementations of
these functions for the families of mixture models included in the
package, e.g., `cvp` and `mlogit`.

| Component | Description |
|---|---|
| `family` | the class of the mixture family that is used to fit the data. |
| `num.iterations` | number of iterations required by the algorithm |
| `grad` | For |
| `max.gradient` | maximum value of the gradient function, evaluated at the beginning of the final iteration. It is only given by function |
| `convergence` | convergence code. |
| `ll` | log-likelihood value at convergence |
| `mix` | MLE of the mixing distribution, being an object of the class |
| `beta` | MLE of the structural parameter |

Yong Wang

Wang, Y. (2007). On fast computation of the non-parametric maximum
likelihood estimate of a mixing distribution. *Journal of the
Royal Statistical Society, Ser. B*, **69**, 185-198.

Wang, Y. (2010). Maximum likelihood computation for fitting
semiparametric mixture models. *Statistics and Computing*,
**20**, 75-86.

```
## Compute the MLE of a finite mixture
x = rnpnorm(200, mu=c(0,4), pr=c(0.7,0.3), sd=1)
for(k in 1:6) plot(cnmms(x, kmax=k), x, add=(k>1), comp="null", col=k+1,
                   main="Finite Normal Mixtures")
legend("topright", 0.3, leg=paste0("k = ",1:6), lty=1, lwd=2, col=2:7)

## Compute a semiparametric MLE
# Common variance problem
x = rcvps(k=100, ni=5:10, mu=c(0,4), pr=c(0.7,0.3), sd=3)
cnmms(x)              # CNM-MS algorithm
cnmpl(x)              # CNM-PL algorithm
cnmap(x)              # CNM-AP algorithm

# Logistic regression with a random intercept
x = rmlogit(k=100, gi=3:5, ni=6:10, pt=c(0,4), pr=c(0.7,0.3),
            beta=c(0,3))
cnmms(x)

## Real-world data
# Random intercept logistic model
data(toxo)
cnmms(mlogit(toxo))

data(betablockers)
cnmms(mlogit(betablockers))

data(lungcancer)
cnmms(mlogit(lungcancer))
```
