| selangle | R Documentation | 
The function helps select the dimension (i.e. the number of components) of a PCA or PLS model by bootstrapping the observations and exploring the stability of the loading matrix P. Stability is quantified by angles between P and its bootstrapped versions.
The general idea was proposed by Ye & Weiss 2003 for sliced inverse regression, and applied to PCA by Luo & Li 2016. The loading matrix P (with a total of A columns, i.e. loading vectors) is computed on the raw matrix X. Then, a non-parametric bootstrap is run on the rows of X, and a loading matrix P(b), also with A columns, is calculated for each bootstrap replication b = 1,...,B.
For a given model dimension a <= A, an "angle" is then calculated between the raw matrix P and each matrix P(b), considering only the first a columns of each. The stability indicator for a matrix P with a loading vectors is the mean of the B angles.
The higher the mean angle (meaning that the compared matrices do not span the same space), the lower the stability of matrix P, whose last columns probably carry large uncertainty.
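The procedure above can be sketched in base R for the PCA case. This is a minimal illustration under stated assumptions, not the rnirs implementation: the helper names pca_loadings, max_angle and boot_stability are hypothetical, the data are simulated, and the angle used here is the largest principal angle between the two subspaces.

```r
# Hypothetical helpers for illustration only (not rnirs functions).

# Loading matrix (p x A) of a PCA on the column-centered X
pca_loadings <- function(X, A) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  svd(Xc, nu = 0, nv = A)$v
}

# Largest principal angle (radians) between the column spaces
# of two matrices with orthonormal columns
max_angle <- function(P1, P2) {
  s <- svd(crossprod(P1, P2), nu = 0, nv = 0)$d  # cosines of principal angles
  acos(min(1, min(s)))                            # clip rounding errors above 1
}

# Mean standardized angle for each dimension a = 1,...,A
boot_stability <- function(X, A, B = 50, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(X)
  P <- pca_loadings(X, A)                         # loadings on the raw matrix
  ang <- matrix(NA_real_, nrow = B, ncol = A)
  for (b in seq_len(B)) {
    # Non-parametric bootstrap: resample the rows with replacement
    Xb <- X[sample.int(n, n, replace = TRUE), , drop = FALSE]
    Pb <- pca_loadings(Xb, A)
    for (a in seq_len(A)) {
      ang[b, a] <- max_angle(P[, 1:a, drop = FALSE], Pb[, 1:a, drop = FALSE])
    }
  }
  colMeans(ang) / (pi / 2)                        # standardize to [0, 1]
}

# Simulated data: 2 strong directions plus small noise
set.seed(1)
n <- 100; p <- 10
scores <- matrix(rnorm(n * 2), n, 2) %*% diag(c(10, 5))
loads <- qr.Q(qr(matrix(rnorm(p * 2), p, 2)))
X <- scores %*% t(loads) + matrix(rnorm(n * p, sd = 0.1), n, p)

r <- boot_stability(X, A = 5, B = 20, seed = 2)
round(r, 2)
```

In such a simulation the mean angle would be expected to stay low up to the true dimension and to rise once noise components are included.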
Two measures of angle are available, selected with argument angle:
1) Default: the "maxsub" angle (see Krzanowski 1979, Hubert et al. 2005 and Engelen et al. 2005).
2) "hot": the vector correlation coefficient q (Hotelling 1936), used by Ye & Weiss 2003 and Luo & Li 2016.
Print the function rnirs::.corvec to see the formulas.
Angles are first computed in radians (a right angle is pi / 2), and then divided by pi / 2 so that they vary between 0 and 1 (1 = minimal stability).
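As a rough illustration of the two measures, both can be derived from the cosines of the principal angles between the two subspaces. The sketch below is an assumption-laden example, not the code of rnirs::.corvec: the helper name subspace_measures is hypothetical, "maxsub" is taken as the largest principal angle, and q is taken as the product of the cosines (Hotelling's vector correlation). How selangle converts q into an angle is not reproduced here.

```r
# Hypothetical helper for illustration only (not the rnirs implementation).
subspace_measures <- function(P1, P2) {
  # Orthonormalize the columns, so that the singular values below
  # are the cosines of the principal angles between the two subspaces
  Q1 <- qr.Q(qr(P1))
  Q2 <- qr.Q(qr(P2))
  cosines <- pmin(1, svd(crossprod(Q1, Q2), nu = 0, nv = 0)$d)
  maxsub <- acos(min(cosines))        # largest principal angle, in radians
  q <- prod(cosines)                  # equals sqrt(det(Q1' Q2 Q2' Q1))
  list(maxsub = maxsub,
       maxsub_std = maxsub / (pi / 2),  # standardized to [0, 1]
       q = q)
}

e <- diag(3)
# Identical subspaces: angle 0, q = 1
same <- subspace_measures(e[, 1:2], e[, 1:2])
# Orthogonal one-dimensional subspaces: right angle, q = 0
orth <- subspace_measures(e[, 1, drop = FALSE], e[, 3, drop = FALSE])
```

The two toy cases mark the extremes of the scale: identical subspaces give a standardized angle of 0, orthogonal ones give 1.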
Jumps in the curve of the mean angle, followed by regular patterns, are also informative.
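One simple way to locate such a jump is to look at the successive differences of the mean angle. The vector r below is made up purely for illustration (it is not an output of selangle), and taking the largest increase is only one possible heuristic.

```r
# Made-up standardized mean angles for a = 1,...,6 (illustrative values only)
r <- c(0.02, 0.03, 0.05, 0.45, 0.50, 0.52)

# Increase of the mean angle when going from a to a + 1 components
jumps <- diff(r)

# Last dimension before the largest jump
a_jump <- which.max(jumps)
a_jump
```

Here the largest increase occurs between 3 and 4 components, which would point to the fourth loading vector as the first unstable one.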
selangle(
    X, Y = NULL, ncomp = NULL, algo = NULL, 
    B = 50, seed = NULL,
    angle = c("maxsub", "hot"),
    plot = TRUE, 
    xlab = "Nb. components", ylab = NULL,
    print = TRUE, 
    ...
    )
| X | A  | 
| Y | For PLS, a  | 
| ncomp | The maximal number of PCA or PLS scores (= components = latent variables) to be calculated. | 
| algo |  For  | 
| B | Number of bootstrap replications. | 
| seed | An integer defining the seed for the random simulation, or  | 
| angle | Type of angle. Possible values are "maxsub" (default) or "hot" (q of Hotelling). | 
| plot | Logical. If  | 
| xlab | Label for the x-axis of the plot. | 
| ylab | Label for the y-axis of the plot. | 
| print | Logical. If  | 
| ... | Optional arguments to pass to the function defined in  | 
A list with element r, the vector of standardized angles.
Engelen, S., Hubert, M., Vanden Branden, K., 2005. A Comparison of Three Procedures for Robust PCA in High Dimensions. Austrian Journal of Statistics 34, 117-126. https://doi.org/10.17713/ajs.v34i2.405
Hubert, M., Rousseeuw, P.J., Vanden Branden, K., 2005. ROBPCA: A New Approach to Robust Principal Component Analysis. Technometrics 47, 64-79. https://doi.org/10.1198/004017004000000563
Krzanowski, W.J., 1979. Between-Groups Comparison of Principal Components. Journal of the American Statistical Association 74, 703-707. https://doi.org/10.1080/01621459.1979.10481674
Luo, W., Li, B., 2016. Combining eigenvalues and variation of eigenvectors for order determination. Biometrika 103, 875-887. https://doi.org/10.1093/biomet/asw051
Ye, Z., Weiss, R.E., 2003. Using the Bootstrap to Select One of a New Class of Dimension Reduction Methods. Journal of the American Statistical Association 98, 968-979. https://doi.org/10.1198/016214503000000927
data(datcass)
Xr <- datcass$Xr
yr <- datcass$yr
ncomp <- 30
selangle(Xr, yr, ncomp = ncomp, B = 10)