rowRmx: Optimally robust estimation for location and/or scale

rowRmx and colRmx R Documentation

Description

The functions rowRmx and colRmx compute optimally robust RMX estimates for (convex) contamination neighborhoods, row-wise and column-wise, respectively. The definition of these estimators can be found in Kohl (2005) and Rieder et al. (2008).

Usage

rowRmx(x, model = "norm", eps.lower=0, eps.upper=0.5, eps=NULL, k = 3L, 
       initial.est=NULL, fsCor = NULL, na.rm = TRUE, message = TRUE, 
       computeSE = NULL, ...)

colRmx(x, model = "norm", eps.lower=0, eps.upper=0.5, eps=NULL, k = 3L, 
       initial.est=NULL, fsCor = NULL, na.rm = TRUE, message = TRUE, 
       computeSE = NULL, ...)

rowRmx.norm(x, eps.lower=0, eps.upper, eps = NULL, initial.est = NULL, k = 3L, 
            fsCor = TRUE, na.rm = TRUE, computeSE = TRUE)
            
rowRmx.binom(x, eps.lower=0, eps.upper, eps = NULL, initial.est = NULL, k = 3L, 
             fsCor = FALSE, na.rm = TRUE, size, computeSE = FALSE, 
             parallel = FALSE, ncores = NULL, aUp = 100*size, 
             cUp = 1e4, delta = 1e-9)
             
rowRmx.pois(x, eps.lower=0, eps.upper, eps = NULL, initial.est = NULL, k = 3L, 
            fsCor = FALSE, na.rm = TRUE, computeSE = FALSE, 
            parallel = FALSE, ncores = NULL, aUp = 100*max(x), 
            cUp = 1e4, delta = 1e-9)

Arguments

x

matrix or data.frame of (numeric) data values.

model

character: short name of the model/distribution (default = "norm"); see also details.

eps.lower

non-negative real (0 <= eps.lower <= eps.upper): lower bound for the amount of gross errors; see details below.

eps.upper

positive real (eps.lower <= eps.upper <= 0.5): upper bound for the amount of gross errors; see details below.

eps

positive real (0 < eps <= 0.5): amount of gross errors. See details below.

k

positive integer: number of steps of the k-step construction used to compute the optimally robust estimator.

initial.est

initial estimate for mean and sd. If missing, median and MAD are used.

fsCor

logical: perform finite-sample correction; see function fsRadius.

na.rm

logical: if TRUE, NA values are removed before the estimator is evaluated.

message

logical: if FALSE, messages are suppressed.

size

size parameter (known!); see dbinom.

computeSE

logical: compute asymptotic standard errors.

parallel

if computeSE = TRUE: logical: use package parallel for the computation.

ncores

if parallel = TRUE: number of cores used for the computation. If missing, the maximum number of cores - 1 is used.

aUp

numeric: upper limit for centering constant a.

cUp

positive real: upper limit for clipping constant c.

delta

positive real: desired accuracy (convergence tolerance).

...

further arguments passed through; e.g., known parameters such as size in case of the binomial model.
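For the normal model, the default initial estimate named under initial.est (median and MAD) can be sketched in base R. This is only an illustration: the helper name row_initial_est is hypothetical and not part of the package, and the package may compute these values differently internally.

```r
# Hypothetical helper (not part of rmx): row-wise median and MAD,
# i.e. the default initial estimates named in the argument description.
row_initial_est <- function(x, na.rm = TRUE) {
  x <- as.matrix(x)
  cbind(mean = apply(x, 1, median, na.rm = na.rm),
        sd   = apply(x, 1, mad, na.rm = na.rm))
}

X <- rbind(c(1, 2, 3, 4, 100),  # one gross outlier in the first row
           c(5, 5, 6, 7, 7))
init <- row_initial_est(X)
init
```

Both estimates are barely affected by the outlier in the first row, which is why they are a natural starting point for a k-step construction.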

Details

These functions are optimized for the situation where one has a matrix and wants to compute the optimally robust RMX estimator for every row or every column of this matrix. In particular, the amount of gross errors is assumed to be constant for all rows or columns, respectively.

If the amount of gross errors (contamination) is known, it can be specified by eps. The radius of the corresponding infinitesimal contamination neighborhood is obtained by multiplying eps by the square root of the sample size.
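As a small worked example of this relation, the following base R lines compute the radius for an assumed contamination of 5% and an assumed sample size of 100; both numbers are illustrative choices, not package defaults.

```r
eps <- 0.05           # assumed (known) amount of gross errors
n   <- 100            # assumed sample size, e.g. observations per row
r   <- eps * sqrt(n)  # radius of the infinitesimal contamination neighborhood
r
```

Note that for a fixed eps the radius grows with the sample size.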

If the amount of gross errors (contamination) is unknown, try to find a rough estimate for it and choose eps.lower and eps.upper such that the amount of gross errors lies between these two bounds.

As models we have implemented so far:

  1. "norm": normal location and scale. Parameters can be set via argument mean and sd; see examples.

  2. "binom": binomial probability (size known).

  3. "pois": Poisson mean.
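The non-normal models are selected via the model argument. The following is a minimal sketch on simulated data, assuming the rmx package is installed; the object names (Xb, Xp, fit.binom, fit.pois) and the simulation settings are arbitrary choices made for illustration. The known binomial size is passed through the ... argument, as described under Arguments.

```r
set.seed(123)
# Simulated count data; sizes and rates are arbitrary illustrative choices.
Xb <- matrix(rbinom(200, size = 10, prob = 0.3), nrow = 4)
Xp <- matrix(rpois(200, lambda = 4), nrow = 4)

# Only run the estimation if the rmx package is available.
if (requireNamespace("rmx", quietly = TRUE)) {
  # binomial probability; 'size' is assumed known and passed via '...'
  fit.binom <- rmx::rowRmx(Xb, model = "binom", size = 10)
  # Poisson mean
  fit.pois <- rmx::rowRmx(Xp, model = "pois")
}
```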

Value

An object of class "RMXlist" is returned. It contains at least the following components:

rmxEst

estimates

rmxIF

object of class optIF; see optIF.

initial.est

initial estimates.

Infos

matrix with information about the estimator

x

data used for the estimation.

n

sample size

eps.lower

lower bound for the amount of gross errors, if provided; otherwise NA.

eps.upper

upper bound for the amount of gross errors, if provided; otherwise NA.

eps

amount of gross errors, if provided; otherwise NA.

fsCor

finite-sample correction

k

k-step construction

call

matched call
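The components listed above can be inspected after fitting. The sketch below assumes that the rmx package is installed and that the components are accessible by name with $, as is usual for list-like S3 objects; the exact layout of each component is not specified here, so str is used to inspect it rather than assuming a shape.

```r
set.seed(42)
X <- matrix(rnorm(200), nrow = 4)

# Only run the estimation if the rmx package is available.
if (requireNamespace("rmx", quietly = TRUE)) {
  fit <- rmx::rowRmx(X, eps = 0.05)
  print(inherits(fit, "RMXlist"))  # class named in the Value section
  str(fit$rmxEst)                  # estimates
  print(fit$n)                     # sample size
  print(fit$call)                  # matched call
}
```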

Author(s)

Matthias Kohl Matthias.Kohl@stamats.de

References

Kohl, M. (2005) Numerical Contributions to the Asymptotic Theory of Robustness. Bayreuth: Dissertation.

Rieder, H. (1994) Robust Asymptotic Statistics. New York: Springer.

Rieder, H., Kohl, M. and Ruckdeschel, P. (2008) The Costs of not Knowing the Radius. Statistical Methods and Applications 17(1) 13-40. Extended version: http://r-kurs.de/RRlong.pdf

Kohl, M., Ruckdeschel, P. and Rieder, H. (2010) Infinitesimally Robust Estimation in General Smoothly Parametrized Models. Statistical Methods and Applications 19(3) 333-354.

See Also

rmx, optIF, fsRadius

Examples

library(rmx)

## simulate 5 rows with 100 observations each and about 5% gross errors
ind <- rbinom(500, size=1, prob=0.05)
X <- matrix(rnorm(500, mean=ind*3, sd=(1-ind) + ind*9), nrow = 5)

## row-wise estimates: default bounds, custom bounds, known eps
rowRmx(X)
rowRmx(X, eps.lower = 0.01, eps.upper = 0.1)
rowRmx(X, eps.lower = 0.01, eps.upper = 0.1, computeSE = TRUE)
rowRmx(X, eps = 0.05)

## column-wise estimates for the transposed data
X1 <- t(X)
colRmx(X1)
colRmx(X1, eps.lower = 0.01, eps.upper = 0.1)
colRmx(X1, eps.lower = 0.01, eps.upper = 0.1, computeSE = TRUE)
colRmx(X1, eps = 0.05)

## adaptive determination of outlier contamination
## fit each row separately, then use the largest outlier proportion as eps.upper
RMXlist <- apply(X, 1, rmx, eps.upper = 0.5)
eps.up <- max(sapply(RMXlist, function(x) outlier(x)$prop.outlier))
rowRmx(X, eps.upper = eps.up)
rowRmx(X, eps.upper = eps.up, computeSE = TRUE)

stamats/rmx documentation built on Sept. 29, 2023, 7:13 p.m.