rmx: Radius-Minimax Estimators

rowRmx and colRmx

R Documentation

Optimally robust estimation for location and/or scale

Description

The functions rowRmx and colRmx compute optimally robust RMX (radius-minimax) estimates for (convex) contamination neighborhoods, row-wise or column-wise for a matrix of data. These estimators are defined in Kohl (2005) and Rieder et al. (2008).

Usage

rowRmx(x, model = "norm", eps.lower=0, eps.upper=0.5, eps=NULL, k = 3L, 
       initial.est=NULL, fsCor = NULL, na.rm = TRUE, message = TRUE, 
       computeSE = NULL, ...)

colRmx(x, model = "norm", eps.lower=0, eps.upper=0.5, eps=NULL, k = 3L, 
       initial.est=NULL, fsCor = NULL, na.rm = TRUE, message = TRUE, 
       computeSE = NULL, ...)

rowRmx.norm(x, eps.lower=0, eps.upper, eps = NULL, initial.est = NULL, k = 3L, 
            fsCor = TRUE, na.rm = TRUE, computeSE = TRUE)
            
rowRmx.binom(x, eps.lower=0, eps.upper, eps = NULL, initial.est = NULL, k = 3L, 
             fsCor = FALSE, na.rm = TRUE, size, computeSE = FALSE, 
             parallel = FALSE, ncores = NULL, aUp = 100*size, 
             cUp = 1e4, delta = 1e-9)
             
rowRmx.pois(x, eps.lower=0, eps.upper, eps = NULL, initial.est = NULL, k = 3L, 
            fsCor = FALSE, na.rm = TRUE, computeSE = FALSE, 
            parallel = FALSE, ncores = NULL, aUp = 100*max(x), 
            cUp = 1e4, delta = 1e-9)

Arguments

`x`	matrix or data.frame of (numeric) data values.
`model`	character: short name of the model/distribution (default = `"norm"`); see also details.
`eps.lower`	non-negative real (0 <= `eps.lower` <= `eps.upper`): lower bound for the amount of gross errors; see details below.
`eps.upper`	positive real (`eps.lower` <= `eps.upper` <= 0.5): upper bound for the amount of gross errors; see details below.
`eps`	positive real (0 < `eps` <= 0.5): amount of gross errors. See details below.
`k`	positive integer: k-step is used to compute the optimally robust estimator.
`initial.est`	initial estimate for `mean` and `sd`. If missing, median and MAD are used.
`fsCor`	logical: perform finite-sample correction; see function `fsRadius`.
`na.rm`	logical: if `TRUE`, `NA` values are removed before the estimator is evaluated.
`message`	logical: if `FALSE`, messages are suppressed.
`size`	size parameter (known!); see `dbinom`.
`computeSE`	logical: compute asymptotic standard errors.
`parallel`	if `computeSE = TRUE`: logical: use package parallel for the computation.
`ncores`	if `parallel = TRUE`: number of cores used for the computation. If missing, the maximum number of cores - 1 is used.
`aUp`	numeric: upper limit for centering constant a.
`cUp`	positive real: upper limit for clipping constant c.
`delta`	positive real: desired accuracy (convergence tolerance).
`...`	further arguments passed through; e.g., known parameters such as `size` in case of the binomial model.

Details

These functions are optimized for the situation where one has a matrix and wants to compute the optimally robust RMX estimator for every row, or every column, of this matrix. In particular, the amount of gross errors is assumed to be constant across all rows, or all columns.
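To illustrate the row-wise setup, the following base-R sketch computes the median and MAD for every row, which are the starting values named under `initial.est` when none is supplied. It is only an illustration under the assumption of the normal model; the column names `mean` and `sd` and the planted outlier are chosen for this example, and the package's internal computation may differ.

```r
# Base-R sketch: row-wise median and MAD as robust starting values.
# Assumption: normal location and scale; not the package's internal code.
set.seed(1)
X <- matrix(rnorm(40), nrow = 4)
X[1, 1] <- 50                            # plant one gross error in row 1

init <- cbind(mean = apply(X, 1, median),
              sd   = apply(X, 1, mad))   # mad() is scaled by 1.4826 by default
init

# Working column-wise on the transposed matrix gives the same values
init.col <- cbind(mean = apply(t(X), 2, median),
                  sd   = apply(t(X), 2, mad))
all.equal(init, init.col)
```

The single gross error in row 1 barely moves the median and MAD of that row, which is why they are suitable starting points for a k-step construction.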

If the amount of gross errors (contamination) is known, it can be specified by eps. The radius of the corresponding infinitesimal contamination neighborhood is obtained by multiplying eps by the square root of the sample size.
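The relation between eps and the radius can be sketched in a few lines of base R. The helper `radius` is hypothetical and not part of the package; it simply evaluates eps times the square root of the sample size and ignores any finite-sample correction (see `fsCor`).

```r
# Hypothetical helper (not part of rmx): radius r = eps * sqrt(n)
radius <- function(eps, n) {
  stopifnot(eps >= 0, eps <= 0.5, n >= 1)
  eps * sqrt(n)
}

n <- 100                     # number of observations per row (or column)
radius(0.05, n)              # known amount of gross errors: 0.05 * 10

# If only bounds are available, the radius lies in the corresponding interval
radius(c(0.01, 0.1), n)      # lower and upper radius
```

With n = 100 and eps = 0.05 the radius is 0.5; the bounds 0.01 and 0.1 give the radius interval from 0.1 to 1.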

If the amount of gross errors (contamination) is unknown, try to find a rough estimate for it and choose eps.lower and eps.upper such that the estimate lies between them.

As models we have implemented so far:

"norm": normal location and scale. Parameters can be set via the arguments mean and sd; see examples.
"binom": binomial probability (size known).
"pois": Poisson mean.

Value

An object of class "RMXlist" is returned. It contains at least the following components:

`rmxEst`	estimates
`rmxIF`	object of class `optIF`; see `optIF`.
`initial.est`	initial estimates.
`Infos`	matrix with information about the estimator
`x`	data used for the estimation.
`n`	sample size
`eps.lower`	lower bound for the amount of gross errors if provided, otherwise `NA`.
`eps.upper`	upper bound for the amount of gross errors if provided, otherwise `NA`.
`eps`	amount of gross errors if provided, otherwise `NA`.
`fsCor`	finite-sample correction
`k`	k-step construction
`call`	matched call

Author(s)

Matthias Kohl Matthias.Kohl@stamats.de

References

Kohl, M. (2005) Numerical Contributions to the Asymptotic Theory of Robustness. Bayreuth: Dissertation.

Rieder, H. (1994) Robust Asymptotic Statistics. New York: Springer.

Rieder, H., Kohl, M. and Ruckdeschel, P. (2008) The Costs of not Knowing the Radius. Statistical Methods and Applications 17(1) 13-40. Extended version: http://r-kurs.de/RRlong.pdf

Kohl, M., Ruckdeschel, P. and Rieder, H. (2010) Infinitesimally Robust Estimation in General Smoothly Parametrized Models. Statistical Methods and Applications 19(3) 333-354.

Examples

library(rmx)

## simulated data: about 5% gross errors from N(3, 9^2), the rest from N(0, 1)
set.seed(123)
ind <- rbinom(500, size=1, prob=0.05)
X <- matrix(rnorm(500, mean=ind*3, sd=(1-ind) + ind*9), nrow = 5)

## row-wise estimates: default bounds, user-specified bounds, known eps
rowRmx(X)
rowRmx(X, eps.lower = 0.01, eps.upper = 0.1)
rowRmx(X, eps.lower = 0.01, eps.upper = 0.1, computeSE = TRUE)
rowRmx(X, eps = 0.05)

## column-wise estimates on the transposed matrix
X1 <- t(X)
colRmx(X1)
colRmx(X1, eps.lower = 0.01, eps.upper = 0.1)
colRmx(X1, eps.lower = 0.01, eps.upper = 0.1, computeSE = TRUE)
colRmx(X1, eps = 0.05)

## adaptive determination of outlier contamination:
## use the largest estimated outlier proportion over all rows as eps.upper
RMXlist <- apply(X, 1, rmx, eps.upper = 0.5)
eps.up <- max(sapply(RMXlist, function(x) outlier(x)$prop.outlier))
rowRmx(X, eps.upper = eps.up)
rowRmx(X, eps.upper = eps.up, computeSE = TRUE)

stamats/rmx documentation built on Sept. 29, 2023, 7:13 p.m.