Test Data Generation for Sparse PCA examples

Share:

Description

Draws a sample data set, as introduced by Zou et al. (2006).

Usage

1
data.Zou (n = 250, p =  c(4, 4, 2), ...)

Arguments

n

The required number of observations.

p

A vector of length 3, specifying how many variables shall be constructed using the three factors V1, V2 and V3.

...

Further arguments passed to or from other functions.

Details

This data set has been introduced by Zou et al. (2006), and then been referred to several times, e.g. by Farcomeni (2009), Guo et al. (2010) and Croux et al. (2011).

The data set contains two latent factors V1 ~ N(0, 290) and V2 ~ N(0, 300) and a third mixed component V3 = -0.3 V1 + 0.925V2 + e; e ~ N(0, 1).
The ten variables Xi of the original data set are constructed the following way:
Xi = V1 + ei; i = 1, 2, 3, 4
Xi = V2 + ei; i = 5, 6, 7, 8
Xi = V3 + ei; i = 9, 10
whereas ei ~ N(0, 1) is indepependent for i = 1 , ..., 10

Value

A matrix of dimension n x sum (p) containing the generated sample data set.

Author(s)

Heinrich Fritz, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>

References

C. Croux, P. Filzmoser, H. Fritz (2011). Robust Sparse Principal Component Analysis Based on Projection-Pursuit, ?? To appear.

A. Farcomeni (2009). An exact approach to sparse principal component analysis, Computational Statistics, Vol. 24(4), pp. 583-604.

J. Guo, G. James, E. Levina, F. Michailidis, and J. Zhu (2010). Principal component analysis with sparse fused loadings, Journal of Computational and Graphical Statistics. To appear.

H. Zou, T. Hastie, R. Tibshirani (2006). Sparse principal component analysis, Journal of Computational and Graphical Statistics, Vol. 15(2), pp. 265-286.

See Also

sPCAgrid, princomp

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
                   ##  data generation
  set.seed (0)
  x <- data.Zou ()

                   ##  applying PCA
  pc <-  princomp (x)
                   ##  the corresponding non-sparse loadings
  unclass (pc$load[,1:3])
  pc$sdev[1:3]

                   ##  lambda as calculated in the opt.TPO - example
  lambda <- c (0.23, 0.34, 0.005)
                   ##  applying sparse PCA
  spc <- sPCAgrid (x, k = 3, lambda = lambda, method = "sd")
  unclass (spc$load)
  spc$sdev[1:3]

                   ## comparing the non-sparse and sparse biplot
  par (mfrow = 1:2)
  biplot (pc, main = "non-sparse PCs")
  biplot (spc, main = "sparse PCs")