paramHankel: Estimate a Mixture's Complexity (and Component...
In anjaweigel/mixComp_package: Estimation of the Order of Finite Mixture Distributions

Description Usage Arguments Details Value See Also Examples

Estimation of a mixture's complexity as well as component weights and parameters based on estimating the determinant of the Hankel matrix of the moments of the mixing distribution and comparing it to determinant values generated by a parametric bootstrap.

paramHankel(obj, j.max = 10, B = 1000, ql = 0.025, 
            qu = 0.975, control = c(trace = 0), ...)

paramHankel.scaled(obj, j.max = 10, B = 100, ql = 0.025, 
                   qu = 0.975, control = c(trace = 0), ...)
                   
## S3 method for class 'paramEst'
print(x, ...)
                   
## S3 method for class 'paramEst'
plot(x, mixture = TRUE, components = TRUE, ylim = NULL, 
     cex.main = 0.9, ...)

`obj`	object of class `datMix`.
`j.max`	integer stating the maximal number of components to be considered.
`B`	integer specifying the number of bootstrap replicates.
`ql`	numeric between 0 and 1 specifying the lower bootstrap quantile to which the observed determinant value will be compared.
`qu`	numeric between 0 and 1 specifying the upper bootstrap quantile to which the observed determinant value will be compared.
`control`	control list of optimization parameters, see `solnp`.
`x`	object of class `paramEst`.
`mixture`	logical indicating whether the estimated mixture density should be drawn, set to `TRUE` by default.
`components`	logical indicating whether the individual mixture components should be drawn, set to `TRUE` by default.
`ylim`	range of y values to use; if not specified (or containing `NA`), the function tries to construct reasonable default values itself.
`cex.main`	The magnification to be used for main titles relative to the current setting of `cex`, see `par`.
`...`	in `paramHankel()` and `paramHankel.scaled()`: further arguments passed to the `boot` function. in `plot.hankDet()`: further arguments passed to the `hist` function plotting the data. in `print.hankDet()`: further arguments passed to the `printCoefmat` function.

Define complexity of a finite mixture F as the smallest integer p, such that its pdf/pmf f can be written as

f(x) = w_1*g(x;θ _1) + … + w_p*g(x;θ _p).

The paramHankel procedure initially assumes the mixture to only contain a single component, setting j = 1, and then sequentially tests p = j versus p = j+1 for j = 1,2, …, until the algorithm terminates. To do so, it determines the MLE for a j-component mixture, generates B parametric bootstrap samples of size n from the distribution the MLE corresponds to and calculates B determinants of the corresponding (j+1)x(j+1) Hankel matrices of the first 2j raw moments of the mixing distribution (for details see nonparamHankel). The null hypothesis H_0: p = j is rejected and j increased by 1 if the determinant value based on the original data lies outside of the interval [ql, qu], a range specified by the ql and qu empirical quantiles of the bootstrapped determinants. Otherwise, j is returned as the complexity estimate.

paramHankel.scaled functions similarly to paramHankel with the exception that the bootstrapped determinants are scaled by the empirical standard deviation of the bootstrap sample. To scale the original determinant, B nonparametric bootstrap samples of size n are generated from the data, the corresponding determinants are calculated and their empirical standard deviation is used.

The MLEs are calculated via the MLE.function attribute (of the datMix object obj) for j = 1, if it is supplied. For all other j (and also for j = 1 in case MLE.function = NULL) the solver solnp is used to calculate the minimum of the negative log likelihood. The initial values supplied to the solver are calculated as follows: the data is clustered into j groups by the function clara and the data corresponding to each group is given to MLE.function (if supplied to the datMix object, otherwise numerical optimization is used here as well). The size of the groups is taken as initial component weights and the MLE's are taken as initial component parameter estimates.

Object of class paramEst with the following attributes:

`dat`	data based on which the complexity is estimated.
`dist`	character string stating the (abbreviated) name of the component distribution, such that the function `ddist` evaluates its density function and `rdist` generates random numbers.
`ndistparams`	integer specifying the number of parameters identifying the component distribution, i.e. if θ \subseteq R^d then `ndistparams` = d.
`formals.dist`	string vector specifying the names of the formal arguments identifying the distribution `dist` and used in `ddist` and `rdist`, e.g. for a gaussian mixture (`dist = norm`) amounts to `mean` and `sd`, as these are the formal arguments used by `dnorm` and `rnorm`.
`discrete`	logical indicating whether the underlying mixture distribution is discrete.
`mle.fct`	attribute `MLE.function` of `obj`.
`pars`	Say the complexity estimate is equal to some j. Then `pars` is a numeric vector of size (d+1)j-1* specifying the component weight and parameter estimates, given as (w_1, ... w_{j-1}, θ 1_1, ... θ 1_j, θ 2_1, ... θ d_j).
`values`	numeric vector of function values gone through during optimization at iteration j, the last entry being the value at the optimum.
`convergence`	indicates whether the solver has converged (0) or not (1 or 2) at iteration j.

nonparamHankel for estimation of the mixture complexity based on the Hankel matrix without parameter estimation, solnp for the solver, datMix for creation of the datMix object.

## create 'Mix' object
poisMix <- Mix("pois", w = c(0.45, 0.45, 0.1), lambda = c(1, 5, 10))


## create random data based on 'Mix' object (gives back 'rMix' object)
set.seed(1)
poisRMix <- rMix(1000, obj = poisMix)


## create 'datMix' object for estimation

# generate list of parameter bounds
poisList <- vector(mode = "list", length = 1)
names(poisList) <- "lambda"
poisList$lambda <- c(0, Inf)

# generate MLE function
MLE.pois <- function(dat){
  mean(dat)
}

# generate function needed for estimating the j^th moment of the 
# mixing distribution via Hankel.method "explicit"

explicit.pois <- function(dat, j){
  res <- 1
  for (i in 0:(j-1)){
    res <- res*(dat-i)
  }
  return(mean(res))
}
        
# generating 'datMix' object
pois.dM <- RtoDat(poisRMix, theta.bound.list = poisList, MLE.function = MLE.pois,
                  Hankel.method = "explicit", Hankel.function = explicit.pois)


## complexity and parameter estimation
set.seed(1)
res <- paramHankel(pois.dM)
plot(res)

anjaweigel/mixComp_package documentation built on Sept. 2, 2020, 3:55 p.m.

anjaweigel/mixComp_package index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

anjaweigel/mixComp_package
Estimation of the Order of Finite Mixture Distributions

paramHankel: Estimate a Mixture's Complexity (and Component...
In anjaweigel/mixComp_package: Estimation of the Order of Finite Mixture Distributions

Description

Usage

Arguments

Details

Value

See Also

Examples

Related to paramHankel in anjaweigel/mixComp_package...

R Package Documentation

Browse R Packages

We want your feedback!

anjaweigel/mixComp_package Estimation of the Order of Finite Mixture Distributions

paramHankel: Estimate a Mixture's Complexity (and Component... In anjaweigel/mixComp_package: Estimation of the Order of Finite Mixture Distributions

Description

Usage

Arguments

Details

Value

See Also

Examples

Related to paramHankel in anjaweigel/mixComp_package...

R Package Documentation

Browse R Packages

We want your feedback!

anjaweigel/mixComp_package
Estimation of the Order of Finite Mixture Distributions

paramHankel: Estimate a Mixture's Complexity (and Component...
In anjaweigel/mixComp_package: Estimation of the Order of Finite Mixture Distributions