Robust Covariance Estimation via Nearest Neighbor Cleaning

Description

covNNC() estimates robust covariance/dispersion matrices by the nearest neighbor variance estimation (NNVE) or (rather) “Nearest Neighbor Cleaning” (NNC) method of Wang and Raftery (2002, JASA).

Usage

covNNC(X, k = min(12, n - 1), pnoise = 0.05, emconv = 0.001,
       bound = 1.5, extension = TRUE, devsm = 0.01)

Arguments

X

matrix in which each row represents an observation or point and each column represents a variable.

k

desired number of nearest neighbors (default is 12, or n - 1 if that is smaller, where n is the number of observations, i.e. rows of X)

pnoise

amount of added noise, given as a fraction (the default 0.05 corresponds to 5 percent)

emconv

convergence tolerance for EM

bound

value used to identify surges in variance caused by outliers wrongly included as signal points (bound = 1.5 means a 50 percent increase)

extension

logical: whether to continue after reaching the last chi-square distance. The default, extension = TRUE, is to continue.

devsm

when extension = TRUE, the algorithm stops if the relative difference in variance is less than devsm (default is 0.01).

Value

A list with components

cov

covariance matrix

mu

mean vector

postprob

posterior probability of each observation being a signal (non-noise) point

classification

classification (0 = noise, 1 otherwise), obtained by rounding postprob

innc

list of initial nearest neighbor cleaning results (components are the covariance, mean, posterior probability and classification)
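
The relation between postprob and classification described above can be illustrated with a minimal base R sketch. The probabilities below are made-up values for illustration only, not output of covNNC(), and the variable name noise_idx is hypothetical.

```r
# Hypothetical posterior probabilities (made-up values, not covNNC() output)
postprob <- c(0.98, 0.91, 0.12, 0.75, 0.03)

# Rounding yields the 0/1 labels: 0 = noise, 1 otherwise
classification <- round(postprob)

# Indices of the observations labelled as noise
noise_idx <- which(classification == 0)

print(classification)
print(noise_idx)
```

Assuming a result stored as fit <- covNNC(X), the same idiom would presumably be which(fit$classification == 0).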

Note

Terms of use: GPL version 2 or newer.

MM: Even though covNNC() is backed by a serious scientific publication, I cannot recommend its use at all.

Author(s)

Naisyin Wang <nwang@stat.tamu.edu> and Adrian Raftery <raftery@stat.washington.edu>, with contributions from Chris Fraley <fraley@stat.washington.edu>.

covNNC(), then named cov.nnve(), was the only function in the CRAN package covRobust (2003), which was archived in 2012.

Martin Maechler allowed ncol(X) == 1, sped up the original code by reducing the amount of scaling, and increased the accuracy (using the internal q.dDk()).

References

Wang, N. and Raftery, A. (2002) Nearest neighbor variance estimation (NNVE): Robust covariance estimation via nearest neighbor cleaning (with discussion). Journal of the American Statistical Association 97, 994–1019.

See also University of Washington Statistics Technical Report 368 (2000), http://www.stat.washington.edu/www/research/reports

See Also

cov.mcd from package MASS; covMcd and covOGK from package robustbase.

The whole package rrcov.

Examples

data(iris)
covNNC(iris[-5])

## hbk is a data set in package robustbase
data(hbk, package = "robustbase")
hbk.x <- data.matrix(hbk[, 1:3])
covNNC(hbk.x)