npmle: Maximum Likelihood Estimate of a Mixing Distribution.
In wiscstatman/rvalues: R-Values for Ranking in High-Dimensional Settings

npmle

R Documentation

Description

Estimates the mixing distribution nonparametrically using an EM algorithm. The estimate is discrete and is returned as a vector of support points and a vector of associated mixture probabilities. The available sampling distributions are the Normal, Poisson, Binomial, and t-distributions.

Usage

npmle(data, family = gaussian, maxiter = 500, tol = 1e-4,
      smooth = TRUE, bass = 0, nmix = NULL)

Arguments

`data`	A data frame or a matrix with the number of rows equal to the number of sampling units. The first column should contain the main estimates, and the second column should contain the nuisance terms.
`family`	family determining the sampling distribution (see family)
`maxiter`	the maximum number of EM iterations
`tol`	the convergence tolerance
`smooth`	logical; whether or not to smooth the estimated cdf
`bass`	controls the smoothness level; only relevant if `smooth=TRUE`. Values of up to 10 indicate increasing smoothness.
`nmix`	optional; the number of mixture components

Details

Assume the following two-level sampling model: X_i|θ_i ~ p(x|θ_i,η_i) and θ_i ~ F for i = 1,...,n. The function npmle seeks an estimate of the mixing distribution F that maximizes the marginal log-likelihood

l(F) = ∑_i log{ ∫ p(X_i | θ, η_i) dF(θ) }.

The distribution function maximizing l(F) is known to be discrete; thus, the estimated mixing distribution is returned as a set of support points and associated mixture probabilities.
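
To make the EM idea concrete, here is a minimal base-R sketch for the Binomial case. It is not the code used by npmle: it assumes a fixed, hypothetical grid of candidate support points and updates only the mixture probabilities, whereas this page does not describe how npmle chooses its support points. The names `grid`, `lik`, `w`, and `post` are illustrative only.

```r
# Simulate Binomial data with a Beta mixing distribution
# (same model as the Examples section, with a smaller n)
set.seed(1)
n <- 500
theta <- rbeta(n, shape1 = 2, shape2 = 10)
ntrials <- rpois(n, lambda = 10) + 1   # +1 avoids zero-trial units (assumption of this sketch)
x <- rbinom(n, size = ntrials, prob = theta)

# Fixed grid of candidate support points (hypothetical choice)
grid <- seq(0.01, 0.99, length.out = 50)

# lik[i, k] = p(x_i | theta_k, ntrials_i)
lik <- outer(seq_len(n), seq_along(grid),
             function(i, k) dbinom(x[i], size = ntrials[i], prob = grid[k]))

# Start from uniform mixture probabilities
w <- rep(1 / length(grid), length(grid))
loglik <- numeric(0)

for (iter in 1:200) {
  # Marginal density of each x_i under the current estimate of F
  marg <- as.vector(lik %*% w)
  loglik <- c(loglik, sum(log(marg)))

  # E-step: post[i, k] is the posterior probability that unit i has theta_k
  post <- sweep(lik, 2, w, "*") / marg

  # M-step: new mixture probabilities are the average posterior probabilities
  w <- colMeans(post)

  # Stop once the log-likelihood increase becomes negligible
  if (iter > 1 && diff(tail(loglik, 2)) < 1e-6) break
}
```

After the loop, `w` sums to one and `loglik` is non-decreasing across iterations, which is the usual monotonicity property of EM. Grid points whose weight shrinks toward zero effectively drop out, leaving a discrete estimate of F.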

Value

An object of class npmix, which is a list containing at least the following components:

`support`	a vector of estimated support points
`mix.prop`	a vector of estimated mixture proportions
`Fhat`	a function; obtained through interpolation of the estimated discrete cdf
`fhat`	a function; estimate of the mixture density
`loglik`	value of the log-likelihood at each iteration
`convergence`	0 indicates convergence; 1 indicates that convergence was not achieved
`numiter`	the number of EM iterations required
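
As a rough illustration of how a discrete estimate relates to an interpolated cdf such as `Fhat`, the sketch below builds a cdf from hypothetical support points and mixture proportions using linear interpolation. The numbers are invented, and the interpolation and smoothing actually used by npmle may differ; `Fhat.sketch` is a name introduced here only for illustration.

```r
# Hypothetical discrete estimate: three support points and their mixture proportions
support  <- c(0.05, 0.15, 0.30)
mix.prop <- c(0.5, 0.3, 0.2)

# Discrete cdf evaluated at the support points
cdf.vals <- cumsum(mix.prop)

# Linear interpolation between support points; 0 to the left, 1 to the right
Fhat.sketch <- approxfun(support, cdf.vals, yleft = 0, yright = 1)

# Evaluate the interpolated cdf at a few points
Fhat.sketch(c(0.01, 0.10, 0.50))
```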

Author(s)

Nicholas Henderson and Michael Newton

References

Laird, N.M. (1978), Nonparametric maximum likelihood estimation of a mixing distribution, Journal of the American Statistical Association, 73, 805–811.

Lindsay, B.G. (1983), The geometry of mixture likelihoods: a general theory, The Annals of Statistics, 11, 86–94.

Examples

## Not run: 
library(rvalues)
data(hiv)
npobj <- npmle(hiv, family = tdist(df=6), maxiter = 25)


###  Generate Binomial data with Beta mixing distribution
n <- 3000
theta <- rbeta(n, shape1 = 2, shape2 = 10)
ntrials <- rpois(n, lambda = 10)
x <- rbinom(n, size = ntrials, prob = theta)

###  Estimate mixing distribution 
dd <- cbind(x,ntrials)
npest <- npmle(dd, family = binomial, maxiter = 25)

### compare with true mixture cdf
tt <- seq(1e-4,1 - 1e-4, by = .001)
plot(npest, lwd = 2)
lines(tt, pbeta(tt, shape1 = 2, shape2 = 10), lwd = 2, lty = 2)

## End(Not run)

wiscstatman/rvalues documentation built on May 22, 2022, 2:41 a.m.