Performs principal components analysis based on the robust S-estimate of the shape matrix. Additionally uses the Fast and Robust Bootstrap method to compute inference measures such as standard errors and confidence intervals.
formula | an object of class
data | data frame from which variables specified in formula are to be taken.
Y | matrix or data frame.
R | number of bootstrap samples. Default is
bdp | required breakdown point for the S-estimates. Should have 0 <
conf | level of the bootstrap confidence intervals. Default is
control | a list with control parameters for tuning the computing algorithm, see
na.action | a function which indicates what should happen when the data contain NAs. Defaults to
... | allows for specifying control parameters directly instead of via
Multivariate S-estimates were introduced by Davies (1987) and can be highly robust while enjoying a reasonable Gaussian efficiency.
The loss function used here is Tukey's biweight. It is tuned to achieve the required breakdown point bdp (any value between 0 and 0.5).
The S-estimates are computed by a call to the implementation of the fast-S algorithm (Salibian-Barrera and Yohai 2006) in the rrcov package of Todorov and Filzmoser (2009). Scontrol provides some adjustable tuning parameters regarding the algorithm. The result of this call is also returned as the value est.
PCA is performed by computing the eigenvalues (eigval) and eigenvectors (eigvec) of the S-estimate of shape, which is the S-estimate of covariance rescaled to have determinant equal to 1. With pvar the function also provides estimates of the percentage of variance explained by the first k principal components, that is, the cumulative sums of the eigenvalues divided by their total. Here, k ranges from 1 to p-1 (with p the number of variables in Y).
The eigenvectors are always given in the order of descending eigenvalues.
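The shape-based PCA described above can be sketched in base R. This is only an illustration under stated assumptions: Sigma is a hand-picked positive definite matrix standing in for an S-estimate of covariance, not output of FRBpcaS, and the variable names merely mirror the components eigval, eigvec and pvar.

```r
# Illustrative sketch only: `Sigma` stands in for an S-estimate of covariance.
# It is a hand-picked positive definite matrix, not output of FRBpcaS().
Sigma <- matrix(c(4, 1, 0,
                  1, 3, 1,
                  0, 1, 2), nrow = 3, byrow = TRUE)
p <- ncol(Sigma)

# Shape matrix: the covariance rescaled to have determinant equal to 1
shape <- Sigma / det(Sigma)^(1 / p)

# For symmetric matrices eigen() returns eigenvalues in descending order
eig <- eigen(shape, symmetric = TRUE)
eigval <- eig$values
eigvec <- eig$vectors

# Cumulative proportion of variance explained by the first k components,
# for k = 1, ..., p-1 (the value for k = p would always be 1)
pvar <- cumsum(eigval)[-p] / sum(eigval)
```

Because the rescaling multiplies every eigenvalue by the same constant, the proportions in pvar are the same for the shape matrix as for the covariance matrix.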
The Fast and Robust Bootstrap (Salibian-Barrera and Zamar 2002) is used to calculate standard errors, as well as so-called basic bootstrap confidence intervals and bias corrected and accelerated (BCa) confidence intervals (Davison and Hinkley 1997, p.194 and p.204 respectively), for the estimates eigval, eigvec and pvar. The bootstrap is also used to estimate the average angles between true and estimated eigenvectors, returned as avgangle. See Salibian-Barrera, Van Aelst and Willems (2006).
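As a rough illustration of an angle between two eigenvectors that lies in [0, pi/2], the sketch below takes the arccosine of the absolute inner product, so that the arbitrary sign of an eigenvector does not matter. The helper eigvec_angle is hypothetical and is not claimed to be the package's own computation.

```r
# Hypothetical helper: angle between two direction vectors, ignoring sign
eigvec_angle <- function(u, v) {
  cosine <- abs(sum(u * v)) / sqrt(sum(u^2) * sum(v^2))
  acos(min(cosine, 1))  # guard against rounding slightly above 1
}

v_hat  <- c(1, 0)                          # stand-in for an estimated eigenvector
v_boot <- c(-cos(pi / 6), -sin(pi / 6))    # stand-in bootstrap eigenvector, sign flipped

angle <- eigvec_angle(v_hat, v_boot)       # angle in radians, between 0 and pi/2
```

Averaging such angles over the bootstrap samples gives a quantity of the kind reported in avgangle.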
The fast and robust bootstrap computations for the S-estimates are performed by Sboot_loccov() and its raw result can be found in bootest.
The actual bootstrap values of the PCA-related quantities can be found in eigval.boot, eigvec.boot and pvar.boot, where each column represents a bootstrap sample. For eigvec.boot, the eigenvectors are stacked on top of each other, and the same goes for eigvec.CI.bca and eigvec.CI.basic, which hold the confidence limits.
The two columns of the confidence limits always hold the lower and upper limits, respectively. For the percentage of variance the function also provides one-sided confidence intervals ((-Inf, upper]), which can be used to test the hypothesis that the true percentage at least equals a certain value.
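The basic bootstrap interval can be sketched as follows, using the usual reflection of the bootstrap quantiles around the estimate. The helper basic_ci and the simulated theta_boot values are assumptions made for illustration; the package may use a different quantile definition, and BCa intervals need further corrections not shown here.

```r
# Hypothetical helper: two-sided basic bootstrap interval for a scalar estimate
basic_ci <- function(theta_hat, theta_boot, conf = 0.95) {
  alpha <- 1 - conf
  q <- quantile(theta_boot, probs = c(1 - alpha / 2, alpha / 2), names = FALSE)
  c(lower = 2 * theta_hat - q[1], upper = 2 * theta_hat - q[2])
}

set.seed(1)
theta_hat  <- 2.5                                # stand-in for an estimate
theta_boot <- theta_hat + rnorm(999, sd = 0.3)   # simulated bootstrap values

ci <- basic_ci(theta_hat, theta_boot)

# One-sided interval (-Inf, upper] at the same level: reflect the 5% quantile
upper_one_sided <- 2 * theta_hat -
  quantile(theta_boot, probs = 0.05, names = FALSE)
```

A hypothesized value larger than upper_one_sided would speak against the hypothesis that the true quantity is at least that value.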
Bootstrap samples are discarded if the fast and robust covariance estimate is not positive definite, so the actual number of recalculations used can be lower than R. This actual number equals R - failedsamples. However, if more than 0.75R of the bootstrap shape estimates are non-positive definite, the failed bootstrap samples are recovered by applying the make.positive.definite function (from package corpcor). If this also fails, the corresponding bootstrap sample is discarded after all, but such a situation should be rare.
This recovery may have an impact on the confidence limits and standard errors, especially those of the smallest eigenvalues in eigval and pvar.
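The bookkeeping of discarded samples can be illustrated with a small base R sketch. The helper is_pos_def, its tolerance and the three toy matrices are assumptions for illustration only; the recovery step via make.positive.definite is not shown.

```r
# Hypothetical helper: TRUE if all eigenvalues of a symmetric matrix exceed tol
is_pos_def <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

# Toy stand-ins for bootstrap recalculations of the scatter matrix
boot_mats <- list(
  diag(2),
  matrix(c(1, 2, 2, 1), nrow = 2),      # eigenvalues 3 and -1: not positive definite
  matrix(c(2, 0.5, 0.5, 1), nrow = 2)
)

ok <- vapply(boot_mats, is_pos_def, logical(1))

R <- length(boot_mats)
failedsamples <- sum(!ok)    # samples that would be discarded
n_used <- R - failedsamples  # recalculations actually used
```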
An object of class FRBpca, which contains the following components:
shape | (p x p) S-estimate of the shape matrix of
eigval | (p x 1) eigenvalues of S shape
eigvec | (p x p) eigenvectors of S-shape
pvar | (p-1 x 1) percentages of variance for S eigenvalues
eigval.boot | (p x R) eigenvalues of S shape
eigvec.boot | (p*p x R) eigenvectors of S-shape (vectorized)
pvar.boot | (p-1 x R) percentages of variance for S eigenvalues
eigval.SE | (p x 1) bootstrap standard error for S eigenvalues
eigvec.SE | (p x p) bootstrap standard error for S eigenvectors
pvar.SE | (p-1 x 1) bootstrap standard error for percentage of variance for S eigenvalues
angles | (p x R) angles between bootstrap eigenvectors and original S eigenvectors (in radians; in [0, pi/2])
avgangle | (p x 1) average angles between bootstrap eigenvectors and original S eigenvectors (in radians; in [0, pi/2])
eigval.CI.bca | (p x 2) BCa intervals for S eigenvalues
eigvec.CI.bca | (p*p x 2) BCa intervals for S eigenvectors (vectorized)
pvar.CI.bca | (p-1 x 2) BCa intervals for percentage of variance for S-eigenvalues
pvar.CIone.bca | (p-1 x 1) one-sided BCa intervals for percentage of variance for S-eigenvalues ((-Inf, upper])
eigval.CI.basic | (p x 2) basic bootstrap intervals for S eigenvalues
eigvec.CI.basic | (p*p x 2) basic bootstrap intervals for S eigenvectors (vectorized)
pvar.CI.basic | (p-1 x 2) basic bootstrap intervals for percentage of variance for S-eigenvalues
pvar.CIone.basic | (p-1 x 1) one-sided basic bootstrap intervals for percentage of variance for S-eigenvalues ((-Inf, upper])
est | list containing the S-estimates of location and scatter
bootest | (list) result of
failedsamples | number of bootstrap samples with non-positive definiteness of shape
conf | a copy of the
method | a character string giving the robust PCA method that was used
w | implicit weights corresponding to the S-estimates (i.e. final weights in the RWLS procedure at the end of the fast-S algorithm)
outFlag | outlier flags: 1 if the robust distance of the observation exceeds the .975 quantile of (the square root of) the chi-square distribution with degrees of freedom equal to the dimension of
Y | copy of the data argument as a matrix
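The outlier-flag rule described for outFlag can be sketched in base R. This is only an illustration: the coordinatewise median and MAD are used here as simple stand-ins for the S-estimates of location and scatter, and the data are simulated with one planted outlier.

```r
set.seed(42)
Y <- matrix(rnorm(200), ncol = 2)   # simulated data, 100 observations
Y[1, ] <- c(10, 10)                 # planted outlier

# Stand-ins for robust location and scatter (NOT the S-estimates themselves)
center  <- apply(Y, 2, median)
scatter <- diag(apply(Y, 2, mad)^2)

# Robust distances: square roots of the squared Mahalanobis-type distances
rd <- sqrt(mahalanobis(Y, center, scatter))

# Flag observations beyond the square root of the .975 chi-square quantile,
# with degrees of freedom equal to the number of variables
cutoff  <- sqrt(qchisq(0.975, df = ncol(Y)))
outFlag <- as.integer(rd > cutoff)
```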
Gert Willems, Stefan Van Aelst and Ella Roelant
P.L. Davies (1987) Asymptotic behavior of S-estimates of multivariate location parameters and dispersion matrices. The Annals of Statistics, 15, 1269-1292.
A.C. Davison and D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press.
M. Salibian-Barrera, S. Van Aelst and G. Willems (2006) PCA based on multivariate MM-estimators with fast and robust bootstrap. Journal of the American Statistical Association, 101, 1198-1211.
M. Salibian-Barrera, S. Van Aelst and G. Willems (2008) Fast and robust bootstrap. Statistical Methods and Applications, 17, 41-71.
M. Salibian-Barrera and R.H. Zamar (2002) Bootstrapping robust estimates of regression. The Annals of Statistics, 30, 556-582.
V. Todorov and P. Filzmoser (2009) An object-oriented framework for robust multivariate analysis. Journal of Statistical Software, 32, 1–47. URL http://www.jstatsoft.org/v32/i03/.
S. Van Aelst and G. Willems (2013) Fast and robust bootstrap for multivariate inference: The R package FRB. Journal of Statistical Software, 53(3), 1–32. URL http://www.jstatsoft.org/v53/i03/.
plot.FRBpca, summary.FRBpca, print.FRBpca, FRBpcaMM, Sboot_loccov, Scontrol
data(ForgedBankNotes)
Spcares <- FRBpcaS(ForgedBankNotes, R=999, bdp=0.25, conf=0.95)
# or using the formula interface
## Not run: Spcares <- FRBpcaS(~., data=ForgedBankNotes, R=999, bdp=0.25, conf=0.95)
# the simple print method shows the standard deviations with confidence limits:
Spcares
# the summary function shows a lot more (see help(summary.FRBpca)):
summary(Spcares)
# ask for the eigenvalues:
Spcares$eigval
# or, in a prettier format, with confidence limits:
summary(Spcares)$eigvals
# note that the standard deviations of the print-output can also be asked for by:
sqrt( summary(Spcares)$eigvals )
# the eigenvectors and their standard errors:
Spcares$eigvec # or prettier: summary(Spcares)$eigvecs
Spcares$eigvec.SE
# take a look at the bootstrap distribution of the first eigenvalue
hist(Spcares$eigval.boot[1,])
# that bootstrap distribution is used to compute confidence limits as depicted
# by the screeplot function:
plotFRBvars(Spcares, cumul=0)
# all plots for the FRB-PCA result:
plot(Spcares)