Computes a fast and robust multivariate outlyingness index for an n by p matrix of multivariate continuous data.
FastPCS(x,nSamp,alpha=0.5,seed=1)

x 
A numeric n (n>5*p) by p (p>1) matrix or data frame. 
nSamp 
A positive integer giving the number of resamples required;

alpha 
Numeric parameter controlling the size of the active subsets,
i.e., 
seed 
Starting value for the random generator. A positive integer. Default is seed = 1.
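The examples further down obtain nSamp from FPCSnumStarts(p, eps). As a rough illustration of how such a count can be derived, the sketch below uses the usual subsampling argument from the robust statistics literature: draw enough starting subsets that at least one is outlier-free with high probability. The helper name num_starts_sketch, the subset size of p+1, and the default probability of 0.99 are all assumptions made for illustration; this is not claimed to be what FPCSnumStarts computes.

```r
# Hypothetical helper (not part of FastPCS): number of random starts needed so
# that, with probability `prob`, at least one starting subset of p+1 rows is
# free of outliers when a fraction `eps` of the rows is contaminated.
num_starts_sketch <- function(p, eps = 0.4, prob = 0.99) {
  stopifnot(p > 1, eps > 0, eps < 1, prob > 0, prob < 1)
  clean_prob <- (1 - eps)^(p + 1)  # chance that one subset of size p+1 is clean
  ceiling(log(1 - prob) / log(1 - clean_prob))
}

# More contamination or more dimensions both call for more starts.
num_starts_sketch(p = 3, eps = 0.25)
num_starts_sketch(p = 3, eps = 0.4)
```

The count grows quickly with both p and eps, which is why the contamination rate passed to such a helper should be a realistic upper bound rather than the worst case by default.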
The current version of FastPCS uses a C-step procedure to improve efficiency (Rousseeuw and van Driessen (1999)). C-steps are taken after the raw subset (H*) has been chosen (according to the I-index) and before reweighting. In experiments, we found that carrying out C-steps
starting from the members of $rawBest
improves the speed of convergence without increasing the bias
of the final estimates. FastPCS is affine equivariant (Schmitt et al. (2014)) and thus consistent at the
elliptical model (Maronna et al. (2006), p. 217).
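For readers unfamiliar with C-steps, the following base R sketch shows the general idea as described by Rousseeuw and van Driessen (1999): estimate the mean and covariance from the current subset, then keep the h observations closest to that fit, and repeat until the subset stops changing. The function names c_step and run_c_steps, the toy data, and the starting subset are hypothetical; this is not the FastPCS implementation, only an illustration of the technique.

```r
# One C-step: given a subset of row indices, compute its mean and covariance,
# then return the h rows with the smallest squared Mahalanobis distances.
c_step <- function(x, subset, h) {
  mu <- colMeans(x[subset, , drop = FALSE])
  sigma <- cov(x[subset, , drop = FALSE])
  d2 <- mahalanobis(x, center = mu, cov = sigma)
  order(d2)[seq_len(h)]
}

# Repeat C-steps until the subset no longer changes (or max_iter is reached).
run_c_steps <- function(x, start, h, max_iter = 100) {
  current <- sort(start)
  for (i in seq_len(max_iter)) {
    updated <- sort(c_step(x, current, h))
    if (identical(updated, current)) break
    current <- updated
  }
  current
}

# Toy data: 80 clean rows plus a tight cluster of 20 outliers in rows 1 to 20.
set.seed(123)
x <- matrix(rnorm(100 * 3), ncol = 3)
x[1:20, ] <- matrix(rnorm(20 * 3, 4.5, 1 / 100), ncol = 3)

start <- 21:40  # stand-in for a raw subset such as results$rawBest
final <- run_c_steps(x, start, h = 60)
any(final %in% 1:20)  # TRUE would mean planted outliers entered the subset
```

Starting from a subset that is already clean, as the raw subset is meant to be, the iterations have little work left to do, which is consistent with the faster convergence reported above.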
alpha 
The value of alpha used. 
nSamp 
The value of nSamp used. 
obj 
The value of the FastPCS objective function of the optimal h-subset. 
rawBest 
The indices of the h observations with the smallest outlyingness indexes. 
best 
The indices of the observations with outlyingness smaller than the rejection threshold after C-steps are taken.
center 
The mean vector of the observations with outlyingness smaller than the rejection threshold after C-steps are taken. 
cov 
Covariance matrix of the observations with outlyingness smaller than the rejection threshold after C-steps are taken. 
distance 
The statistical distance of each observation with respect to the center vector and cov matrix of the observations with outlyingness smaller than the rejection threshold after C-steps are taken. 
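To make the relationship between center, cov and distance concrete, here is a minimal base R sketch that computes distances from a given center and covariance matrix and flags large ones. It does not call FastPCS: the center and covmat objects are stand-ins built from rows known to be clean, and both the use of squared Mahalanobis distances and the chi-squared cutoff at the 0.975 quantile are assumptions for illustration, not the rejection threshold used by the package.

```r
# Toy data: rows 1 to 20 are planted outliers, rows 21 to 100 are clean.
set.seed(123)
x <- matrix(rnorm(100 * 3), ncol = 3)
x[1:20, ] <- matrix(rnorm(20 * 3, 4.5, 1 / 100), ncol = 3)

# Stand-ins for results$center and results$cov (assumption: classical
# estimates from the clean rows, used here only to keep the sketch runnable).
center <- colMeans(x[21:100, ])
covmat <- cov(x[21:100, ])

# Squared Mahalanobis distance of every row to the robust-style fit.
d2 <- mahalanobis(x, center = center, cov = covmat)

# Assumed cutoff: 0.975 quantile of a chi-squared with p degrees of freedom.
cutoff <- qchisq(0.975, df = ncol(x))
flagged <- which(d2 > cutoff)
all(1:20 %in% flagged)
```

Because the fit is built only from clean rows, the planted outliers sit far from it and are not able to mask themselves by inflating the covariance.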
Kaveh Vakili
Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006). Robust Statistics: Theory and Methods. Wiley, New York.
Rousseeuw, P. J. and van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics. Vol. 41, pp. 212–223.
Schmitt, E., Oellerer, V. and Vakili, K. (2014). The finite sample breakdown point of PCS. Statistics and Probability Letters. Vol. 94, pp. 214–220.
Vakili, K. and Schmitt, E. (2014). Finding multivariate outliers with FastPCS. Computational Statistics & Data Analysis. Vol. 69, pp. 54–66. (http://arxiv.org/abs/1301.2053)
## testing outlier detection
set.seed(123)
n<-100
p<-3
x0<-matrix(rnorm(n*p),ncol=p)
# replace the first 30 rows with a tight cluster of outliers
x0[1:30,]<-matrix(rnorm(30*p,4.5,1/100),ncol=p)
# z is 0 for the planted outliers and 1 for the clean observations
z<-c(rep(0,30),rep(1,70))
nstart<-FPCSnumStarts(p=p,eps=0.4)
results<-FastPCS(x=x0,nSamp=nstart)
# all 1s if only clean observations were retained
z[results$best]
## testing outlier detection, different value of alpha
set.seed(123)
n<-100
p<-3
x0<-matrix(rnorm(n*p),ncol=p)
# only 20 planted outliers this time, so a larger alpha can be used
x0[1:20,]<-matrix(rnorm(20*p,4.5,1/100),ncol=p)
z<-c(rep(0,20),rep(1,80))
nstart<-FPCSnumStarts(p=p,eps=0.25)
results<-FastPCS(x=x0,nSamp=nstart,alpha=0.75)
# all 1s if only clean observations were retained
z[results$best]
#testing exact fit
set.seed(123)
n<-100
p<-3
x0<-matrix(rnorm(n*p),ncol=p)
x0[1:30,]<-matrix(rnorm(30*p,5,1/100),ncol=p)
# make the 70 clean rows lie exactly on a hyperplane
x0[31:100,3]<-x0[31:100,2]*2+1
z<-c(rep(0,30),rep(1,70))
nstart<-FPCSnumStarts(p=p,eps=0.4)
results<-FastPCS(x=x0,nSamp=nstart)
z[results$rawBest]
results$obj
#testing affine equivariance
n<-100
p<-3
set.seed(123)
x0<-matrix(rnorm(n*p),ncol=p)
nstart<-500
results1<-FastPCS(x=x0,nSamp=nstart,seed=1)
# apply a nonsingular linear transformation to the data
a1<-matrix(0.9,p,p)
diag(a1)<-1
x1<-x0%*%a1
results2<-FastPCS(x=x1,nSamp=nstart,seed=1)
results2$center
results2$cov
#should be the same
results1$center%*%a1
t(a1)%*%results1$cov%*%a1
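The identities behind the affine equivariance test can be checked without the package. The classical mean and covariance are also affine equivariant, so the base R sketch below uses them to verify that transforming the data by a1 transforms the location by a1 and the scatter by t(a1) and a1 on either side. This only illustrates the property being tested; it does not say anything about FastPCS output itself.

```r
set.seed(123)
x0 <- matrix(rnorm(100 * 3), ncol = 3)

# Same nonsingular transformation as in the example above.
a1 <- matrix(0.9, 3, 3)
diag(a1) <- 1
x1 <- x0 %*% a1

# Location: estimate on transformed data equals transformed estimate.
loc_ok <- isTRUE(all.equal(colMeans(x1), drop(colMeans(x0) %*% a1)))

# Scatter: estimate on transformed data equals t(A) %*% S %*% A.
cov_ok <- isTRUE(all.equal(cov(x1), t(a1) %*% cov(x0) %*% a1))

c(location = loc_ok, scatter = cov_ok)
```

Comparing with all.equal rather than == allows for floating point rounding, which is also advisable when comparing results1 and results2.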

