Computes a robust PCA model with q components for an n by p matrix of multivariate data using the FastHCS algorithm.
x: A numeric n (n>5*q) by p (p>1) matrix or data frame.
nSamp: A positive integer giving the number of resamples required;
alpha: Numeric parameter controlling the size of the active subsets i.e.,
q: Number of principal components to compute. Note that p>q>1, 1<q<n. Default is q=10.
seed: Starting value for random generator. Default is seed = 1.
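The size constraints stated in the argument descriptions can be checked before fitting. The helper below is a minimal sketch in base R; `check_fasthcs_dims` is a hypothetical name, not part of the package, and it only encodes the conditions listed above (n > 5*q, p > 1, 1 < q < p).

```r
# Hypothetical helper (not part of the FastHCS package): checks the
# dimension constraints stated in the argument descriptions.
check_fasthcs_dims <- function(x, q) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  problems <- character(0)
  if (!is.numeric(x))   problems <- c(problems, "x must be numeric")
  if (p <= 1)           problems <- c(problems, "need p > 1")
  if (q <= 1 || q >= p) problems <- c(problems, "need 1 < q < p")
  if (n <= 5 * q)       problems <- c(problems, "need n > 5*q")
  problems  # empty character vector when all constraints hold
}

x_ok <- matrix(rnorm(100 * 30), ncol = 30)
check_fasthcs_dims(x_ok, q = 5)    # no problems reported
check_fasthcs_dims(x_ok, q = 25)   # violates n > 5*q
```

With the default q=10, these conditions imply that x needs more than 50 rows and more than 10 columns.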
A list with components:
rawBest: The indexes of the h members of H*, the raw FastHCS optimal subset.
obj: The FastHCS objective function corresponding to H*, the selected subset of h observations.
rawDist: Outlyingness index of the data on the raw q-dimensional subset that initialized H*.
best: The indexes of the members of H+, the FastHCS subset after the C-steps.
center: The p-vector of column means of the observations with indexes in
loadings: The (rank q) loadings matrix of the observations with indexes in
eigenvalues: The first
od: The orthogonal distances of the centered data with respect to the subspace spanned by the
sd: The score distances of the data projected on the subspace spanned by the
cutoff.od: The cutoff for the vector of orthogonal distances.
cutoff.sd: The cutoff for the vector of score distances.
scores: The data projected on the space of the principal components (the centred data multiplied by the loadings matrix). Hence, cov(scores) is the diagonal matrix diag(eigenvalues).
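To make the relationship between center, loadings, eigenvalues, scores, od and sd concrete, the sketch below reproduces them in base R for a given subset of observations. It does not call FastHCS: the subset `idx` is chosen by hand, the fit is a classical PCA of that subset, and the two distance formulas are the usual ones from the robust PCA literature, assumed here rather than taken from the package source.

```r
set.seed(1)
n <- 100; p <- 6; q <- 2
x <- matrix(rnorm(n * p), ncol = p)
idx <- 31:100                      # assumed stand-in for the subset indexes (e.g. 'best')

center <- colMeans(x[idx, ])       # p-vector of column means of the subset
eig <- eigen(cov(x[idx, ]), symmetric = TRUE)
loadings <- eig$vectors[, 1:q]     # p x q loadings matrix (rank q)
eigenvalues <- eig$values[1:q]     # first q eigenvalues

xc <- sweep(x, 2, center)          # centre all n observations
scores <- xc %*% loadings          # n x q projected data

# Orthogonal distance: norm of the part of each row outside the subspace.
od <- sqrt(rowSums((xc - scores %*% t(loadings))^2))
# Score distance: norm of the scores, each scaled by its eigenvalue.
sd_dist <- sqrt(rowSums(sweep(scores^2, 2, eigenvalues, "/")))

# On the subset used for the fit, the scores are uncorrelated with
# variances equal to the eigenvalues.
round(cov(scores[idx, ]), 10)
```

In this sketch the identity cov(scores) = diag(eigenvalues) holds exactly only over the rows in `idx`, since those rows define the covariance matrix that is decomposed.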
Kaveh Vakili, Eric Schmitt
Schmitt E. and Vakili K. (2015). Robust PCA with FastHCS. (http://arxiv.org/abs/1402.3514)
library(FastHCS)
## testing outlier detection
n<-100
p<-30
Q<-5
set.seed(123)
x0<-matrix(rnorm(n*p),nc=p)
x0[1:30,]<-matrix(rnorm(30*p,4.5,1/100),nc=p)
z<-c(rep(0,30),rep(1,70))
nStarts<-FHCSnumStarts(q=Q,eps=0.4)
Fit<-FastHCS(x=x0,nSamp=nStarts,q=Q)
z[Fit$best]
plot(Fit,col=(!z)+1,pch=16)
## testing outlier detection, different value of alpha
n<-100
p<-30
Q<-5
set.seed(123)
x0<-matrix(rnorm(n*p),nc=p)
x0[1:20,]<-matrix(rnorm(20*p,4.5,1/100),nc=p)
z<-c(rep(0,20),rep(1,80))
nStarts<-FHCSnumStarts(q=Q,eps=0.25)
Fit<-FastHCS(x=x0,nSamp=nStarts,q=Q,alpha=0.75)
z[Fit$best]
#testing exact fit
n<-100
p<-5
Q<-4
set.seed(123)
x0<-matrix(rnorm(n*p),nc=p)
x0[1:30,]<-matrix(rnorm(30*p,4.5,1/100),nc=p)
x0[31:100,4:5]<-x0[31:100,2]
z<-c(rep(0,30),rep(1,70))
nStart<-FHCSnumStarts(q=Q,eps=0.4)
results<-FastHCS(x=x0,nSamp=nStart,q=Q)
z[results$best]
results$obj
#testing rotation equivariance
n<-100
p<-10
Q<-3
set.seed(123)
x0<-scale(matrix(rnorm(n*p),nc=p))
A<-diag(rep(1,p))
A[1:2,1:2]<-c(0,1,-1,0)
x1<-x0%*%A
nStart<-FHCSnumStarts(q=Q,eps=0.4)
r0<-FastHCS(x=x0,nSamp=nStart,q=Q,seed=0)
r1<-FastHCS(x=x1,nSamp=nStart,q=Q,seed=0)
max(abs(log(r1$eigenvalues[1:Q]/r0$eigenvalues[1:Q])))
Loading required package: matrixStats
Loading required package: robustbase
Attaching package: 'robustbase'
The following objects are masked from 'package:matrixStats':
colMedians, rowMedians
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[39] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[39] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[39] 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[1] 0
[1] 0
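The last value printed above appears to be the rotation-equivariance check from the final example: the largest absolute log-ratio of the eigenvalues before and after an orthogonal rotation of the data. The same check can be illustrated without the package. The sketch below uses classical PCA (`prcomp`) as a stand-in for FastHCS, so it demonstrates the check itself rather than the behaviour of the package.

```r
set.seed(123)
n <- 100; p <- 10; Q <- 3
x0 <- scale(matrix(rnorm(n * p), ncol = p))

# Orthogonal matrix rotating the first two coordinates by 90 degrees.
A <- diag(p)
A[1:2, 1:2] <- c(0, 1, -1, 0)
x1 <- x0 %*% A

# Classical PCA as a stand-in; eigenvalues are the squared sdev.
ev0 <- prcomp(x0)$sdev[1:Q]^2
ev1 <- prcomp(x1)$sdev[1:Q]^2

# Close to 0 (up to floating-point error) when the eigenvalues are unchanged.
max(abs(log(ev1 / ev0)))
```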