Estimation of a mixture's complexity based on estimating the determinant of the Hankel matrix of the moments of the mixing distribution. The estimated determinants can be scaled and/or penalized.
obj |
object of class |
j.max |
integer specifying the maximal number of components to be considered. |
pen.function |
a function with arguments |
scaled |
logical specifying whether the vector of estimated determinants should be scaled. |
B |
integer specifying the number of bootstrap replicates used for scaling of the determinants. Ignored if |
x |
object of class |
type |
character denoting type of plot, see, e.g. |
xlab,ylab |
labels for the x and y axis with defaults (the default for |
mar |
numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot, see |
ylim |
range of y values to use. |
... |
|
Define the complexity of a finite mixture F as the smallest integer p, such that its pdf/pmf f can be written as
f(x) = w_1*g(x; θ_1) + … + w_p*g(x; θ_p).
nonparamHankel estimates p by iteratively increasing the assumed complexity j and calculating the determinant of the (j+1) x (j+1) Hankel matrix made up of the first 2j raw moments of the mixing distribution. As shown by Dacunha-Castelle & Gassiat (1997), once the correct complexity is reached (i.e. for all j >= p), this determinant is zero.
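The vanishing-determinant property can be checked on a mixing distribution whose moments are known exactly. The following is a minimal sketch in base R, not the package's implementation: the helper names `mixing_moments` and `hankel_det` are hypothetical, the weights and support points are illustrative, and the matrix is assumed to have entries m_(i+k) for i, k = 0, ..., j with m_0 = 1.

```r
# Hypothetical helpers for illustration; not part of the package.

# Exact raw moments m_0, ..., m_{2j} of a discrete mixing distribution
# that puts weights w on the support points theta (so m_0 = 1).
mixing_moments <- function(w, theta, j) {
  sapply(0:(2 * j), function(k) sum(w * theta^k))
}

# Determinant of the (j+1) x (j+1) Hankel matrix with entries m_{i+k}.
hankel_det <- function(w, theta, j) {
  m <- mixing_moments(w, theta, j)
  # m is stored with an offset of 1, i.e. m[1] holds m_0
  H <- outer(0:j, 0:j, function(i, k) m[i + k + 1])
  det(H)
}

# Illustrative three-point mixing distribution (p = 3)
w     <- c(0.1, 0.6, 0.3)
theta <- c(0.8, 0.2, 0.4)

dets <- sapply(1:4, function(j) hankel_det(w, theta, j))
print(dets)
```

With three support points, the determinants for j = 1 and j = 2 are strictly positive, while those for j = 3 and j = 4 are zero up to floating-point error. In practice the moments are estimated from data, so the determinants for j >= p are only close to zero.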
This suggests an estimation procedure for p: first find a consistent estimator of the moments of the mixing distribution, then choose the estimator estim_p as the value of j that yields a sufficiently small value of the determinant. Since the estimated determinant is close to 0 for all j >= p, this could lead to choosing estim_p larger than the true value. The function therefore returns all estimated determinant values corresponding to complexities up to j.max, so that the user can pick the lowest j generating a sufficiently small determinant. In addition, the function allows the inclusion of a penalty term, a function of the sample size n and the currently assumed complexity j, which is added to the determinant value (by supplying pen.function), and/or scaling of the determinants (by setting scaled = TRUE). For scaling, a nonparametric bootstrap is used to calculate the covariance of the estimated determinants, with B being the number of bootstrap replicates. The inverse of the square root of this covariance matrix (i.e. the matrix S^(-1) such that A = SS, where A is the covariance matrix) is then multiplied with the estimated determinant vector to get the scaled determinant vector.
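The scaling step can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: the matrix `boot_dets` is simulated noise standing in for bootstrapped determinant vectors, the vector `dets` is made up, and the helper `inv_sqrt` (a hypothetical name) computes the inverse symmetric square root through an eigendecomposition, which is one of several possible ways to obtain S^(-1).

```r
# Hypothetical helper: inverse symmetric square root of a covariance matrix A,
# i.e. S^(-1) with A = S %*% S, computed via the eigendecomposition of A.
inv_sqrt <- function(A) {
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}

set.seed(1)
# Stand-in for B = 200 bootstrap replicates of a determinant vector of length 3
# (simulated correlated noise, NOT real bootstrap output).
mix_mat   <- matrix(c(1, 0.5, 0.2, 0, 1, 0.3, 0, 0, 1), nrow = 3)
boot_dets <- matrix(rnorm(200 * 3), ncol = 3) %*% mix_mat

A     <- cov(boot_dets)   # estimated covariance of the determinants
S_inv <- inv_sqrt(A)

# Made-up determinant vector for complexities j = 1, 2, 3
dets        <- c(0.5, 0.1, 0.01)
scaled_dets <- as.vector(S_inv %*% dets)

# Sanity check: S^(-1) A S^(-1) should be the identity matrix
print(round(S_inv %*% A %*% S_inv, 8))
```

Multiplying by S^(-1) standardizes the determinant vector so that its components have approximately unit variance and are uncorrelated, which makes values across different j comparable.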
For a thorough discussion of the methods that can be used for the estimation of the moments, see the details section of datMix.
The vector of estimated determinants (optionally scaled and/or penalized), returned as an object of class hankDet with the following attributes:
scaled |
logical indicating whether the determinants are scaled. |
pen |
logical indicating whether a penalty was added to the determinants. |
dist |
character string stating the (abbreviated) name of the component distribution, such that the function |
D. Dacunha-Castelle and E. Gassiat, "The estimation of the order of a mixture model", Bernoulli, Volume 3, Number 3, 279-299, 1997.
paramHankel for a similar approach which estimates the component weights and parameters on top of the complexity, datMix for the creation of the datMix object.
## create 'Mix' object
geomMix <- Mix("geom", w = c(0.1, 0.6, 0.3), prob = c(0.8, 0.2, 0.4))
## create random data based on 'Mix' object (gives back 'rMix' object)
set.seed(1)
geomRMix <- rMix(1000, obj = geomMix)
## create 'datMix' object for estimation
# explicit function giving the estimate for the j^th moment of the
# mixing distribution, needed for Hankel.method "explicit"
explicit.fct.geom <- function(dat, j){
  1 - ecdf(dat)(j - 1)
}
## generating 'datMix' object
geom.dM <- RtoDat(geomRMix, Hankel.method = "explicit",
                  Hankel.function = explicit.fct.geom)
## function for penalization
pen <- function(j, n){
  (j*log(n))/(sqrt(n))
}
## estimate determinants
set.seed(1)
geomdets_pen <- nonparamHankel(geom.dM, pen.function = pen, j.max = 5)
plot(geomdets_pen, main = "Three component geometric mixture")