Compare nipals/empca with no missing Compare nipals/empca with missing, equal weight Compare nipals/empca with missing, unequal weight
# Python - Coeff (scores) [[-2.809 0.097 0.244 0.050] [-1.834 0.286 0.010 -0.135] [-0.809 0.963 -0.341 0.078] [-0.155 -1.129 0.548 0.026] [0.707 -0.723 -0.736 -0.024] [1.830 -0.290 -0.157 0.030] [3.070 0.796 0.431 -0.026]] # m1e <- empca(x=B1, w=B1wt, ncomp=4) # Un-sweep the eigenvalues to compare to python results # R round( sweep( m1e$scores, 2, m1e$eig, "*"), 3) PC1 PC2 PC3 PC4 G1 -2.809 0.097 -0.244 0.050 G2 -1.834 0.286 -0.010 -0.135 G3 -0.809 0.963 0.341 0.078 G4 -0.155 -1.129 -0.548 0.026 G5 0.707 -0.723 0.736 -0.024 G6 1.830 -0.290 0.157 0.030 G7 3.070 0.796 -0.431 -0.026 # Matlab - P (scores) 0.5590 0.0517 0.2210 0.2910 0.3650 0.1520 0.0095 -0.7840 0.1610 0.5120 -0.3080 0.4530 0.0309 -0.6010 0.4950 0.1510 -0.1410 -0.3850 -0.6640 -0.1380 -0.3650 -0.1540 -0.1420 0.1760 -0.6110 0.4230 0.3890 -0.1490 # R round(m1e$scores, 3) PC1 PC2 PC3 PC4 G1 -0.559 -0.052 0.221 -0.291 G2 -0.365 -0.152 0.009 0.784 G3 -0.161 -0.512 -0.308 -0.453 G4 -0.031 0.601 0.495 -0.151 G5 0.141 0.385 -0.664 0.138 G6 0.365 0.154 -0.142 -0.176 G7 0.611 -0.423 0.389 0.149
# Python with initial Identity matrix [[2.791 0.125 0.325 -0.035] [1.528 -0.989 -0.211 0.172] [0.990 -0.651 -0.117 -0.186] [0.159 1.463 0.530 0.020] [-0.628 0.862 -0.730 -0.032] [-1.738 0.406 -0.139 -0.071] [-2.917 -0.712 0.520 -0.047]] Eigvec (loadings) [[-0.309 -0.839 -0.298 0.300] [-0.502 0.014 0.154 -0.615] [-0.470 -0.086 0.766 0.219] [-0.441 0.521 -0.236 0.615] [-0.487 0.128 -0.496 -0.326]] # R R> m2e <- empca(x=B2, w=B2wt, ncomp=4, seed=NULL) # # Un-sweep the eigenvalues to compare to python results R> round( sweep( m2e$scores, 2, m2e$eig, "*"), 3) PC1 PC2 PC3 PC4 G1 -2.791 0.216 -0.356 0.066 G2 -1.528 -0.942 0.187 -0.150 G3 -0.990 -0.620 0.101 0.207 G4 -0.159 1.472 -0.522 -0.032 G5 0.628 0.844 0.744 0.021 G6 1.738 0.351 0.161 0.050 G7 2.917 -0.808 -0.493 0.019 R> round( m2e$loadings, 3) PC1 PC2 PC3 PC4 E1 0.309 -0.839 0.298 -0.300 E2 0.502 0.014 -0.154 0.615 E3 0.470 -0.086 -0.766 -0.219 E4 0.441 0.521 0.236 -0.615 E5 0.487 0.128 0.496 0.326
Python code by @bailey2012principal, retrieved 1 Mar 2019 from https://github.com/sbailey/empca .
The Python code is difficult to read in places for a person [like me] not well-versed with Python. Three examples:
k
takes in for k in range(self.nvec)
.for
loops instead of a function.Model
class structure makes it a bit tricky to figure out what objects have actually been modified inside a function.The Python code iterates these two EM steps:
ncomp
principal components P simultaneously (iterate each to convergence). Orthogonalize P.For a complete-data problem, python and R give similar results. Note the Coeff
matrix in python does NOT have eigenvalues swept out of the columns.
For the missing-data problem, the python results are somewhat different from R.
Matlab code by Vicente Parot, retrieved 1 Mar 2019 from https://www.mathworks.com/matlabcentral/fileexchange/45353-empca.
The Matlab code feels similar to R.
The Matlab code calculates principal components sequentially, one at time. For each principal component, the algorithm iterates between these two steps:
While this is a type of EM algorithm, it is not the algorithm described by Bailey (2012) and is considered further.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.