# PCAgrid: (Sparse) Robust Principal Components using the Grid search... In pcaPP: Robust PCA by Projection Pursuit

 PCAgrid R Documentation

## (Sparse) Robust Principal Components using the Grid search algorithm

### Description

Computes a desired number of (sparse) (robust) principal components using the grid search algorithm in the plane. The global optimum of the objective function is searched for in planes rather than in the full p-dimensional space, using a regular grid in each plane.

### Usage

```
PCAgrid (x, k = 2, method = c ("mad", "sd", "qn"),
    maxiter = 10, splitcircle = 25, scores = TRUE, zero.tol = 1e-16,
    center = l1median, scale, trace = 0, store.call = TRUE, control, ...)

sPCAgrid (x, k = 2, method = c ("mad", "sd", "qn"), lambda = 1,
    maxiter = 10, splitcircle = 25, scores = TRUE, zero.tol = 1e-16,
    center = l1median, scale, trace = 0, store.call = TRUE, control, ...)
```

### Arguments

| Argument | Description |
|---|---|
| `x` | a numerical matrix or data frame of dimension (`n x p`) which provides the data for the principal components analysis. |
| `k` | the desired number of components to compute. |
| `method` | the scale estimator used to detect the direction with the largest variance. Possible values are `"sd"`, `"mad"` and `"qn"`; the latter can also be written `"Qn"`. `"mad"` is the default value. |
| `lambda` | the strength of the sparseness constraint (`sPCAgrid` only). Either a single value for all components, or a vector of length `k` with a different value for each component. See `opt.TPO` for the choice of this argument. |
| `maxiter` | the maximum number of iterations. |
| `splitcircle` | the number of directions in which the algorithm should search for the largest variance. The direction with the largest variance is searched for among the directions defined by a number of equally spaced points on the unit circle. This argument determines how many such points are used to split the unit circle. |
| `scores` | a logical value indicating whether the scores of the principal components should be calculated. |
| `zero.tol` | the zero tolerance used internally for checking convergence, etc. |
| `center` | this argument indicates how the data is to be centered. It can be a function like `mean` or `median`, or a vector of length `ncol(x)` containing the center value of each column. |
| `scale` | this argument indicates how the data is to be rescaled. It can be a function like `sd` or `mad`, or a vector of length `ncol(x)` containing the scale value of each column. |
| `trace` | an integer value >= 0, specifying the tracing level. |
| `store.call` | a logical variable, specifying whether the function call shall be stored in the result structure. |
| `control` | a list whose elements must be the same as (or a subset of) the parameters above. If the control object is supplied, the parameters from it are used and override any other parameters given. |
| `...` | further arguments passed to or from other functions. |
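The two accepted forms of `center` and `scale` (a function or a per-column vector) can be illustrated with a small base R sketch. The helper `resolve_columnwise()` is hypothetical and not part of pcaPP; it assumes that a function is applied column by column, which fits univariate estimators such as `median` or `mad` but does not cover multivariate centers such as the default `l1median`.

```r
# Hypothetical helper (not part of pcaPP): turn a `center`- or `scale`-style
# argument into one numeric value per column of x.
resolve_columnwise <- function(x, arg) {
  x <- as.matrix(x)
  if (is.function(arg)) {
    apply(x, 2, arg)                  # assumption: estimator applied per column
  } else {
    stopifnot(is.numeric(arg), length(arg) == ncol(x))
    arg                               # vector form: used as given
  }
}

x   <- as.matrix(iris[, 1:4])
ctr <- resolve_columnwise(x, median)  # function form
scl <- resolve_columnwise(x, mad)     # function form
fixed <- resolve_columnwise(x, c(1, 2, 3, 4))  # vector form

# center first, then rescale, column by column
x_std <- sweep(sweep(x, 2, ctr, "-"), 2, scl, "/")
```

After this standardization each column of `x_std` has median 0 and MAD 1, which is the kind of robust preprocessing these two arguments are meant to control.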

### Details

In contrast to `PCAgrid`, the function `sPCAgrid` computes sparse principal components. The strength of the applied sparseness constraint is specified by argument `lambda`.

Similar to the function `princomp`, there is a `print` method for these objects that prints the results in a readable format, and the `plot` method produces a scree plot (`screeplot`). There is also a `biplot` method.

Angle halving is an extension of the original algorithm. In the original algorithm, the search directions are determined by a number of points on the unit circle in the interval [-pi/2 ; pi/2). Angle halving means that this interval is halved in each iteration: the first approximation uses the interval above, the second approximation uses [-pi/4 ; pi/4), and so on. This usually gives better results with fewer iterations.
NOTE: in previous implementations angle halving could be suppressed by the former argument "`anglehalving`". This can still be done by setting argument `maxiter = 0`.
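The idea of a grid search with angle halving can be sketched in base R for the simplest case of two-dimensional data. This is an illustration under stated assumptions, not the pcaPP implementation: `grid_search_2d()` is a hypothetical helper, it searches a single plane only, and it re-centers the shrinking interval on the best angle found so far.

```r
# Sketch only: find the direction of largest robust scale in 2-D data by
# scanning equally spaced angles and halving the scanned interval each pass.
# grid_search_2d() is a hypothetical helper, not part of pcaPP.
grid_search_2d <- function(x, scale_fun = mad, splitcircle = 25, maxiter = 10) {
  best_angle <- 0
  half_width <- pi / 2                 # first pass scans [-pi/2, pi/2)
  for (iter in seq_len(maxiter)) {
    # splitcircle equally spaced angles in [best - hw, best + hw)
    offsets <- seq(-half_width, half_width, length.out = splitcircle + 1)
    angles  <- best_angle + offsets[-length(offsets)]
    # robust scale of the data projected onto each candidate direction
    objective <- vapply(
      angles,
      function(a) scale_fun(drop(x %*% c(cos(a), sin(a)))),
      numeric(1)
    )
    best_angle <- angles[which.max(objective)]
    half_width <- half_width / 2       # angle halving
  }
  c(cos(best_angle), sin(best_angle))  # unit-length direction
}

# Toy data: largest spread along the direction at 30 degrees
set.seed(1)
z   <- cbind(rnorm(500, sd = 3), rnorm(500, sd = 1))
rot <- matrix(c(cos(pi / 6), sin(pi / 6), -sin(pi / 6), cos(pi / 6)), 2)
x   <- z %*% t(rot)
dir_hat <- grid_search_2d(x)
```

Because each pass keeps the same number of grid points while the interval shrinks, the angular resolution doubles per iteration, which is why a modest `splitcircle` can still locate the direction precisely.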

### Value

The function returns an object of class `"princomp"`, i.e. a list similar to the output of the function `princomp`.

| Component | Description |
|---|---|
| `sdev` | the (robust) standard deviations of the principal components. |
| `loadings` | the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). This is of class `"loadings"`: see `loadings` for its `print` method. |
| `center` | the means that were subtracted. |
| `scale` | the scalings applied to each variable. |
| `n.obs` | the number of observations. |
| `scores` | if `scores = TRUE`, the scores of the supplied data on the principal components. |
| `call` | the matched call. |
| `obj` | a vector containing the objective function values. For function `PCAgrid` this is the same as `sdev`. |
| `lambda` | the lambda each component has been calculated with (`sPCAgrid` only). |
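Since the result has class `"princomp"`, its components can be accessed like those of an ordinary `princomp` fit. The sketch below uses `stats::princomp` on the built-in `USArrests` data purely as a stand-in, so that it runs without pcaPP; an object returned by `PCAgrid` or `sPCAgrid` would additionally carry `obj` (and `lambda` for `sPCAgrid`), as listed above.

```r
# Stand-in illustration with stats::princomp (not a PCAgrid fit)
pc <- princomp(USArrests, cor = TRUE)

pc$sdev                        # standard deviations of the components
unclass(pc$loadings)[, 1:2]    # loadings as a plain numeric matrix
head(pc$scores, 3)             # scores of the first three observations
pc$n.obs                       # number of observations

# share of variance per component, relative to the components computed
explained <- pc$sdev^2 / sum(pc$sdev^2)
```

With `PCAgrid` only `k` components are computed, so a ratio like `explained` would be relative to those `k` components rather than to the total variance.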

### Note

See the vignette "Compiling pcaPP for Matlab" which comes with this package to compile and use these functions in Matlab.

### Author(s)

Heinrich Fritz, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>

### References

C. Croux, P. Filzmoser, M. Oliveira, (2007). Algorithms for Projection-Pursuit Robust Principal Component Analysis, Chemometrics and Intelligent Laboratory Systems, Vol. 87, pp. 218-225.

C. Croux, P. Filzmoser, H. Fritz (2011). Robust Sparse Principal Component Analysis Based on Projection-Pursuit. To appear.

### See Also

`PCAproj`, `princomp`

### Examples

```
# multivariate data with outliers
library(mvtnorm)
x <- rbind(rmvnorm(200, rep(0, 6), diag(c(5, rep(1,5)))),
           rmvnorm( 15, c(0, rep(20, 5)), diag(rep(1, 6))))
# Here we calculate the principal components with PCAgrid
pc <- PCAgrid(x)
# we could draw a biplot too:
biplot(pc)
# now we want to compare the results with the non-robust principal components
pc <- princomp(x)
# again, a biplot for comparison:
biplot(pc)

set.seed (0)
x <- data.Zou ()

##  applying PCA
pc <-  princomp (x)
pc$sdev[1:3]

##  lambda as calculated in the opt.TPO - example
lambda <- c (0.23, 0.34, 0.005)
##  applying sparse PCA
spc <- sPCAgrid (x, k = 3, lambda = lambda, method = "sd")