options(width=90)

The principal components of $\bf X'X$, where $\bf X$ is a column-centered matrix, can be found by several methods, including SVD and NIPALS.

The SVD (Singular Value Decomposition) of a matrix $X$ is $$ \bf X = U S V', $$ where $\bf U$ $(n \times r)$ and $\bf V$ $(k \times r)$ are orthogonal matrices and $\bf S$ is a diagonal matrix of $r$ singular values.

SVD does not allow missing values in the data.

The NIPALS (Nonlinear Iterative Partial Least Squares) algorithm can be used to find the first few (or all) principal components with the decomposition
$$
\bf X = \bf T \bf P '
$$
where the columns of $\bf T$ are called *scores* and the columns of
$\bf P$ (the rows of $\bf P'$) are called the *loadings*.

The algorithm begins by initializing $h=1$ and $\bf X_h = \bf X$, then proceeds through the following basic steps:

- Choose $\bf t_h$ as any column of $\bf X_h$.
- Compute loadings $\bf p_h = X_h' t_h / t_h' t_h$.
- Let $\bf p_h = p_h / \sqrt{p_h' p_h}$.
- Compute scores $\bf t_h = X_h p_h / p_h' p_h$.

Repeat (3) and (4) until convergence for the $h^{th}$ principal component.

Let $\bf X_{h+1} = \bf X_h - t_h p_h'$. Let $\lambda_h = \bf t_h' t$ (eigen value). Increment $h = h + 1$ and repeat for the next principal component.

Assemble the columns of $\bf T$ from the $\bf t_h$ and the columns of $\bf P$ from the vectors $\bf p_h$.

The resulting PCs may be scaled in different ways. One way to scale the PCA solution is to define the loadings $\bf P = V$ and $\bf T = U'S$.

The NIPALS algorithm can be modified to accommodate missing values using the method of @martens2001multivariate (p. 381).

If, for a certain variable $k$ [column of $\bf X$], a missing value is encountered in $\bf X$ for a certain object $i$ [row of $\bf X$], then the corresponding elements in $\bf t_{ih}$ must also be skipped in the calculation of the loadings, which for $\bf X$-variable $k$ is $$ \bf p_{hk} = X_{k,h-1} t_h' / (t_h' t_h) . $$

Likewise, if, for a certain sample $i$ [row of $\bf X$], a missing value is encountered in $\bf X$ for a certain variable $k$ [column of $\bf X$], then the corresponding elements in $\bf p_{kh}$ must also be skipped in calculating the scores, which for sample $i$ is $$ \bf t_{ih} = X_{i,h-1} p_h / (p_h' p_h) $$ This method may have convergence problems if there are many missing values.

Because of the accumulation of floating-point errors, the orthogonality of the principal components is quickly lost as the number of components increases. @andrecut2009parallel provided a Gram-Schmidt modified version of NIPALS that stabilizes the orthogonality by re-orthogonalizing the scores and loadings at each iteration. The 'corrected' terms are:

$$ \bf p_c = p - P_{1:h} P_{1:h}' p $$ and

$$ \bf t_c = t - T_{1:h} T_{1:h}' t $$ where $\bf P_{1:h}$ and $\bf T_{1:h}$ are the loadings and scores matrices based on the first $h$ principal components. Since $\bf P_{1:h} P_{1:h}'$ only needs to be calculated once for each PC (and incrementally), the orthogonalization is not very computationally expensive.

This correction method is also used by SAS PROC HPPRINCOMP (which does not allow missing values).

A small dataset with two missing values.

require(nipals) B <- matrix(c(50, 67, 90, 98, 120, 55, 71, 93, 102, 129, 65, 76, 95, 105, 134, 50, 80, 102, 130, 138, 60, 82, 97, 135, 151, 65, 89, 106, 137, 153, 75, 95, 117, 133, 155), ncol=5, byrow=TRUE) B2 <- B B2[1,1] <- B2[2,1] <- NA m0 <- svd(scale(B)) # center and scale

require("nipals") m1 <- nipals::nipals(B2, gramschmidt=FALSE) m2 <- nipals::nipals(B2, gramschmidt=TRUE)

Model `m1`

omits the Gram-Schmidt orthogonalization step at each iteration. Model `m2`

includes it.

The eigenvalues for the two models are very similar.

round( m1$eig, 3) round( m2$eig, 3)

In theory, the loadings matrix $\bf P$ is orthogonal so that $\bf P' P = I$. If there are missing values, however, then the calculation of approximate PCs causes numerical errors to accumulate, so that in practice only the first few components can be accurately calculated. (The coordinates of the last PC can often be quite poor.)

In this small example, the first 3 PCs of model `m1`

are fairly orthogonal, but the 4th and 5th PC are not so good. For model `m2`

, the PCs are nearly exactly orthogonal.

# loadings round( crossprod(m1$loadings), 3) # P'P = t(P) %*% P round( crossprod(m2$loadings), 3)

Also in theory, $\bf T' T = I$ (if eigenvalues are removed from T), but missing values again invalidate this identity, unless the Gram-Schmidt method is used.

# scores round( crossprod(m1$scores), 3) # T'T = t(T) %*% T round( crossprod(m2$scores), 3)

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.