MDEstimator: Function to compute minimum distance estimates

Description

The function MDEstimator provides a general way to compute minimum distance estimates.

Usage

MDEstimator(x, ParamFamily, distance = KolmogorovDist, dist.name, 
            paramDepDist = FALSE, startPar = NULL, Infos, trafo = NULL,
            penalty = 1e20, validity.check = TRUE, asvar.fct, na.rm = TRUE,
            ..., .withEvalAsVar = TRUE, nmsffx = "",
            .with.checkEstClassForParamFamily = TRUE)
CvMMDEstimator(x, ParamFamily, muDatOrMod = c("Mod","Dat", "Other"),
            mu = NULL, paramDepDist = FALSE, startPar = NULL, Infos,
            trafo = NULL, penalty = 1e20, validity.check = TRUE, 
            asvar.fct = .CvMMDCovariance, na.rm = TRUE, ...,
            .withEvalAsVar = TRUE, nmsffx = "",
            .with.checkEstClassForParamFamily = TRUE)
KolmogorovMDEstimator(x, ParamFamily, paramDepDist = FALSE, startPar = NULL, Infos, 
            trafo = NULL, penalty = 1e20, validity.check = TRUE, asvar.fct, 
            na.rm = TRUE, ..., .withEvalAsVar = TRUE, nmsffx = "",
            .with.checkEstClassForParamFamily = TRUE)
TotalVarMDEstimator(x, ParamFamily, paramDepDist = FALSE, startPar = NULL, Infos, 
            trafo = NULL, penalty = 1e20, validity.check = TRUE, asvar.fct, 
            na.rm = TRUE, ..., .withEvalAsVar = TRUE, nmsffx = "",
            .with.checkEstClassForParamFamily = TRUE)
HellingerMDEstimator(x, ParamFamily, paramDepDist = FALSE, startPar = NULL, Infos, 
            trafo = NULL, penalty = 1e20, validity.check = TRUE, asvar.fct, 
            na.rm = TRUE, ..., .withEvalAsVar = TRUE, nmsffx = "",
            .with.checkEstClassForParamFamily = TRUE)
CvMDist2(e1,e2,... )

Arguments

x

(empirical) data

ParamFamily

object of class "ParamFamily"

distance

(generic) function to compute the distance between the (empirical) data and objects of class "Distribution".

dist.name

optional name of distance

muDatOrMod

a character string specifying which integration measure mu is used in the Cramer-von-Mises distance: the empirical cdf (argument value "Dat"), the current model distribution (argument value "Mod"), or a given integration (probability) measure / distribution mu (argument value "Other"); must be one of "Mod" (default), "Dat", or "Other". You can specify just the initial letter.

mu

optional integration (probability) measure for the CvM MDE; defaults to NULL and is ignored if muDatOrMod is "Dat" or "Mod"; in case "Other", it must be of class UnivariateDistribution.

paramDepDist

logical; will the computation of the distance be parameter dependent (see also the note below)? If TRUE, the distance function must be able to digest a parameter thetaPar; otherwise this parameter will be eliminated if present in the ... argument.

startPar

initial information used by optimize resp. optim; i.e., if the (total) parameter is of length 1, startPar is a search interval, else it is an initial parameter value; if NULL, slot startPar of ParamFamily is used to produce it; in the multivariate case, startPar may also be of class Estimate, in which case slot untransformed.estimate is used.

Infos

character: optional information about the estimator

trafo

an object of class MatrixorFunction – a transformation for the main parameter

penalty

(non-negative) numeric: penalizes invalid parameter values

validity.check

logical: shall the returned parameter value be checked for validity? Defaults to yes (TRUE)

asvar.fct

optionally: a function to determine the corresponding asymptotic variance; if given, asvar.fct takes arguments L2Fam (the parametric model as object of class L2ParamFamily) and param (the parameter value as object of class ParamFamParameter); arguments are called by name; asvar.fct may also process further arguments passed through the ... argument

na.rm

logical: if TRUE, the estimator is evaluated at complete.cases(x).

...

for the estimators: further arguments to criterion or optimize or optim, respectively; for CvMDist2, these can be used e.g. by E().

.withEvalAsVar

logical: shall slot asVar be evaluated (if asvar.fct is given) or just the call be returned?

nmsffx

character: a potential suffix to be appended to the estimator name.

e1

object of class "Distribution" or class "numeric"

e2

object of class "Distribution"

.with.checkEstClassForParamFamily

logical: should .checkEstClassForParamFamily be called at the end of the function? Defaults to TRUE; can be switched off to save computation time or because this is already checked in a calling wrapper function.

Details

The argument distance has to be a (generic) function with arguments the empirical data as well as an object of class "Distribution" and possibly ...; e.g. KolmogorovDist (default), TotalVarDist or HellingerDist. Uses mceCalc for method dispatch.
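
As a rough illustration of what a minimum distance estimate is, the following base-R sketch minimizes the Kolmogorov distance between the empirical cdf of the data and an exponential model by a one-dimensional search with optimize. It does not use distrMod and is not the package's implementation; the helper names kolm_dist and md_exp_rate, the exponential model, and the search interval are assumptions made for this example only.

```r
## Kolmogorov distance between the empirical cdf of x and a model cdf
## (hypothetical helper, not part of distrMod)
kolm_dist <- function(x, cdf) {
  x <- sort(x)
  n <- length(x)
  u <- cdf(x)
  ## the supremum is attained at a jump of the empirical cdf,
  ## either just at or just before an observation
  max(pmax(seq_len(n) / n - u, u - (seq_len(n) - 1) / n))
}

## minimum Kolmogorov distance estimate of an exponential rate;
## the parameter has length 1, so a search interval is supplied
md_exp_rate <- function(x, interval = c(1e-3, 100)) {
  crit <- function(rate) kolm_dist(x, function(q) pexp(q, rate = rate))
  optimize(crit, interval = interval)
}

set.seed(123)
x_exp <- rexp(200, rate = 2)
fit <- md_exp_rate(x_exp)
fit$minimum    # estimated rate
fit$objective  # attained Kolmogorov distance
```

For a parameter of length greater than one, the same criterion would be handed to optim with an initial value instead of a search interval, which mirrors the role of startPar described above.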

The functions CvMMDEstimator, KolmogorovMDEstimator, TotalVarMDEstimator, and HellingerMDEstimator are aliases where the distance is fixed. More specifically, CvMMDEstimator uses Cramer-von-Mises distance, see CvMDist, with integration measure mu equal to the empirical cdf, to the current best fitting model distribution, or to a user-supplied measure (the alternative is selected by argument muDatOrMod). As the estimator is asymptotically linear, asymptotic variances are available. In case of alternative "Dat", this variance is computed by means of the helper function .CvMMDCovarianceWithMux; in case of alternative "Mod", the helper function .CvMMDCovariance is used. In both cases one may use these helper functions to get hold of the respective influence function. For covariances computed by .CvMMDCovariance, diagnostics on the involved integrations are available if argument diagnostic is TRUE. In this case an attribute diagnostic is attached to the return value, which may be inspected and accessed through showDiagnostic and getDiagnostic.

KolmogorovMDEstimator uses Kolmogorov distance, see KolmogorovDist; TotalVarMDEstimator uses total variation distance, see TotalVarDist; and HellingerMDEstimator uses Hellinger distance, see HellingerDist.

Function CvMDist2 calls CvMDist and computes the Cramer-von-Mises distance between distributions e1 and e2 with integration measure mu equal to e2; it is used in alternative "Mod" in CvMMDEstimator.
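
To make the case "integration measure equal to the model distribution" concrete, here is a base-R sketch of the integral of the squared difference between the empirical cdf and a continuous model cdf, taken with respect to the model. It uses the classical closed form of the Cramer-von-Mises statistic and cross-checks it against a midpoint rule. The function names cvm_mod and cvm_mod_num are hypothetical, and the value returned by CvMDist or CvMDist2 may be scaled differently (for instance, as a square root), so this is only meant to show the idea.

```r
## integral of (F_n - F)^2 dF for a continuous model cdf F,
## via the closed form of the Cramer-von-Mises statistic
cvm_mod <- function(x, cdf) {
  n <- length(x)
  u <- sort(cdf(x))
  1 / (12 * n^2) + mean((u - (2 * seq_len(n) - 1) / (2 * n))^2)
}

## numerical cross-check: substituting u = F(q) turns dF(q) into du
## on (0, 1); the integral is approximated by a midpoint rule
cvm_mod_num <- function(x, cdf, grid = 1e5) {
  Gn <- ecdf(cdf(x))
  u <- (seq_len(grid) - 0.5) / grid
  mean((Gn(u) - u)^2)
}

set.seed(123)
x_gam <- rgamma(50, scale = 0.5, shape = 3)
F_mod <- function(q) pgamma(q, scale = 0.5, shape = 3)

cvm_mod(x_gam, F_mod)      # closed form
cvm_mod_num(x_gam, F_mod)  # numerical approximation
```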

Value

The estimators return an object of S4-class "MCEstimate" which inherits from class "Estimate". CvMDist2 returns the respective distance.

Theoretical Background

It should be noted that CvMMDEstimator results in an asymptotically linear (hence asymptotically normal) estimator with an influence function which is always bounded. HellingerMDEstimator approaches the ML estimator for growing sample size, hence is asymptotically efficient, while for finite sample size it is bias robust. KolmogorovMDEstimator is square-root-n consistent but, due to the faceted level sets of the distance, fails to be asymptotically normal. In the terminology of Donoho and Liu (1988), TotalVarMDEstimator and HellingerMDEstimator rely on strong distances, while CvMMDEstimator and KolmogorovMDEstimator use weak distances, so the latter ensure protection against larger classes of contamination (simply because the distribution balls based on the respective distances contain more elements).
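
The robustness claim can be explored with a small base-R experiment. The sketch below estimates the location of a normal model from a sample in which 10% of the observations are replaced by a gross outlier, once by the sample mean (the ML estimate of the normal location) and once by minimizing a Cramer-von-Mises type criterion with the model as integration measure. The contamination scheme, the function name cvm_loc, and the search interval are assumptions chosen for illustration; distrMod is not used.

```r
set.seed(1)
x_cont <- c(rnorm(90), rep(10, 10))   # 10% gross outliers at 10

## Cramer-von-Mises type criterion for the model N(theta, 1),
## integrated with respect to the model distribution
cvm_loc <- function(theta, x) {
  n <- length(x)
  u <- sort(pnorm(x, mean = theta))
  1 / (12 * n^2) + mean((u - (2 * seq_len(n) - 1) / (2 * n))^2)
}

md_est <- optimize(cvm_loc, interval = range(x_cont), x = x_cont)$minimum
ml_est <- mean(x_cont)

## compare both estimates of the location (the clean data have mean 0)
c(MLE = ml_est, CvM_MDE = md_est)
```

The outliers pull the sample mean away from the bulk of the data, whereas each observation can change the distance criterion only by a bounded amount.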

Note

The distance function may be called together with a parameter thetaPar which is the current parameter value under consideration, i.e., the value under which the model distribution is considered. Hence, if desired, particular distance functions could make use of this information by, say, computing the distance differently for different parameter values.
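
A parameter-dependent distance could, for instance, restrict the comparison to a window around the current location value. The following base-R sketch shows such a function for a location model. The name trimmed_kolm, its argument halfwidth, and the convention of passing the model as a cdf function are hypothetical; a distance actually passed to MDEstimator would instead receive the data and an object of class "Distribution", as described in the Details section.

```r
## Kolmogorov-type distance evaluated only at observations within
## halfwidth of the current parameter value thetaPar
## (hypothetical sketch; falls back to the full distance if thetaPar is NULL)
trimmed_kolm <- function(x, cdf, thetaPar = NULL, halfwidth = 3, ...) {
  x <- sort(x)
  n <- length(x)
  u <- cdf(x)
  d <- pmax(seq_len(n) / n - u, u - (seq_len(n) - 1) / n)
  if (!is.null(thetaPar)) d <- d[abs(x - thetaPar) <= halfwidth]
  if (length(d) == 0) return(1)  # no data in the window: maximal distance
  max(d)
}

set.seed(42)
x_loc <- c(rnorm(95, mean = 1), rep(8, 5))
F1 <- function(q) pnorm(q, mean = 1)

trimmed_kolm(x_loc, F1)                # full Kolmogorov distance
trimmed_kolm(x_loc, F1, thetaPar = 1)  # distance within the window around 1
```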

Author(s)

Matthias Kohl Matthias.Kohl@stamats.de,
Peter Ruckdeschel peter.ruckdeschel@uni-oldenburg.de

References

Beran, R. (1977). Minimum Hellinger distance estimates for parametric models. Annals of Statistics, 5(3), 445-463.

Donoho, D.L. and Liu, R.C. (1988). The "automatic" robustness of minimum distance functionals. Annals of Statistics, 16(2), 552-586.

Huber, P.J. (1981). Robust Statistics. New York: Wiley.

Parr, W.C. and Schucany, W.R. (1980). Minimum distance and robust estimation. Journal of the American Statistical Association, 75(371), 616-624.

Rao, P.V., Schuster, E.F., and Littell, R.C. (1975). Estimation of Shift and Center of Symmetry Based on Kolmogorov-Smirnov Statistics. Annals of Statistics, 3, 862-873.

Rieder, H. (1994). Robust Asymptotic Statistics. New York: Springer.

See Also

ParamFamily-class, ParamFamily, MCEstimator, MCEstimate-class, fitdistr

Examples

library(distrMod)

## (empirical) Data
set.seed(123)
x <- rgamma(50, scale = 0.5, shape = 3)

## parametric family of probability measures
G <- GammaFamily(scale = 1, shape = 2)

## Kolmogorov(-Smirnov) minimum distance estimator
MDEstimator(x = x, ParamFamily = G, distance = KolmogorovDist)
## or
KolmogorovMDEstimator(x = x, ParamFamily = G)

## von Mises minimum distance estimator with default mu = Mod
MDEstimator(x = x, ParamFamily = G, distance = CvMDist)


### these examples take too much time for R CMD check --as-cran

## von Mises minimum distance estimator with default mu = Mod
MDEstimator(x = x, ParamFamily = G, distance = CvMDist,
            asvar.fct = .CvMMDCovarianceWithMux)
## or
CvMMDEstimator(x = x, ParamFamily = G)
## or
CvMMDEstimator(x = x, ParamFamily = G, muDatOrMod="Mod")

## or with data based integration measure:
CvMMDEstimator(x = x, ParamFamily = G, muDatOrMod="Dat")

## von Mises minimum distance estimator with mu = N(0,1)
MDEstimator(x = x, ParamFamily = G, distance = CvMDist, mu = Norm())
## or, with asy Var
MDEstimator(x = x, ParamFamily = G, distance = CvMDist, mu = Norm(),
            asvar.fct = function(L2Fam, param, ...){
            .CvMMDCovariance(L2Fam=L2Fam, param=param, mu=Norm(), N = 400)
            } )
## synonymous to
CvMMDEstimator(x = x, ParamFamily = G, muDatOrMod="Other", mu = Norm())

## Total variation minimum distance estimator
## gamma distributions are discretized
MDEstimator(x = x, ParamFamily = G, distance = TotalVarDist)
## or
TotalVarMDEstimator(x = x, ParamFamily = G)
## or smoothing of empirical distribution (takes some time!)
#MDEstimator(x = x, ParamFamily = G, distance = TotalVarDist, asis.smooth.discretize = "smooth")

## Hellinger minimum distance estimator
## gamma distributions are discretized
distroptions(DistrResolution = 1e-10)
MDEstimator(x = x, ParamFamily = G, distance = HellingerDist, startPar = c(1,2))
## or
HellingerMDEstimator(x = x, ParamFamily = G, startPar = c(1,2))
distroptions(DistrResolution = 1e-6) # default
## or smoothing of empirical distribution (takes some time!)
MDEstimator(x = x, ParamFamily = G, distance = HellingerDist, asis.smooth.discretize = "smooth")


distrMod documentation built on Nov. 16, 2022, 9:07 a.m.