# i_pca: Incremental Principal Component Analysis (PCA) In idm: Incremental Decomposition Methods

 i_pca R Documentation

## Incremental Principal Component Analysis (PCA)

### Description

This function computes the Principal Component Analysis (PCA) solution on the covariance matrix using the incremental method of Hall, Marshall & Martin (2002).
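
The idea of updating a covariance-based PCA from chunks can be illustrated with a minimal base R sketch. This is not the eigenspace-merging algorithm of Hall, Marshall & Martin (2002), nor the code used by `i_pca`; it swaps in the simpler technique of merging chunk means and scatter matrices, which yields the same covariance matrix (and therefore the same eigenvalues) as a single pass over the full data. The helper names `chunk_summary` and `merge_summaries` are hypothetical.

```r
# Illustrative sketch only: base R, not the idm implementation.
set.seed(1)
X  <- matrix(rnorm(200 * 4), ncol = 4)   # 200 cases, 4 variables
X1 <- X[1:120, ]                         # "starting" chunk
X2 <- X[121:200, ]                       # "incoming" chunk

# Summarise a chunk by its size, mean vector and centered scatter matrix
chunk_summary <- function(A) {
  m  <- colMeans(A)
  Ac <- sweep(A, 2, m)                   # center the columns
  list(n = nrow(A), mean = m, scatter = crossprod(Ac))
}

# Merge two chunk summaries without revisiting the raw data
merge_summaries <- function(a, b) {
  n     <- a$n + b$n
  delta <- a$mean - b$mean
  list(
    n       = n,
    mean    = (a$n * a$mean + b$n * b$mean) / n,
    scatter = a$scatter + b$scatter + (a$n * b$n / n) * tcrossprod(delta)
  )
}

merged          <- merge_summaries(chunk_summary(X1), chunk_summary(X2))
cov_incremental <- merged$scatter / (merged$n - 1)

# Eigenvalues from the merged summaries versus the batch covariance matrix
ev_incremental <- eigen(cov_incremental, symmetric = TRUE)$values
ev_batch       <- eigen(cov(X), symmetric = TRUE)$values
all.equal(ev_incremental, ev_batch)
```

The incremental method referenced above works on eigenspaces rather than on full scatter matrices, which is what allows a reduced rank (`current_rank`) to be carried between updates.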

### Usage

```
i_pca(data1, data2, current_rank, nchunk = 2, disk = FALSE)
```

### Arguments

| Argument | Description |
|---|---|
| `data1` | Matrix or data frame of starting data, or the full data if `data2 = NULL` |
| `data2` | Matrix or data frame of incoming data; omitted when the full data is given in `data1` |
| `current_rank` | Rank of approximation or number of components to compute; if empty, the full rank is used |
| `nchunk` | Number of incoming data chunks (equal splits of `data2`, default = 2) or a vector with the row size of each incoming data chunk |
| `disk` | Logical indicating whether the output is saved to hard disk |
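
The two accepted forms of `nchunk` (a single count, or a vector of row sizes) can be made concrete with a small base R sketch. The helper `chunk_rows` below is hypothetical and not part of idm; in particular, assigning leftover rows to the last chunk and requiring the sizes to sum to the number of incoming rows are assumptions made for illustration, and the package may handle these cases differently.

```r
# Hypothetical helper (not part of idm): turn an nchunk-style value into
# the row indices of each incoming chunk.
chunk_rows <- function(n_rows, nchunk = 2) {
  if (length(nchunk) == 1) {
    # Single number: nchunk (roughly) equal splits of the incoming rows
    sizes <- rep(n_rows %/% nchunk, nchunk)
    # Assumption: rows left over after the equal split go to the last chunk
    sizes[nchunk] <- sizes[nchunk] + n_rows %% nchunk
  } else {
    # Vector: explicit row size of each incoming chunk
    # Assumption: the sizes must cover all incoming rows exactly
    stopifnot(sum(nchunk) == n_rows)
    sizes <- nchunk
  }
  ends   <- cumsum(sizes)
  starts <- ends - sizes + 1
  Map(seq, starts, ends)
}

# 1869 incoming rows split into 20 chunks
equal_chunks <- chunk_rows(1869, nchunk = 20)
lengths(equal_chunks)

# Explicit chunk sizes
custom_chunks <- chunk_rows(10, nchunk = c(3, 3, 4))
custom_chunks[[3]]
```

Passing a vector is useful when incoming data arrives in batches of unequal size, whereas a single number is convenient for simulating a stream from a data set that is already complete.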

### Value

| Component | Description |
|---|---|
| `rowpcoord` | Row scores on the principal components |
| `colpcoord` | Variable loadings |
| `eg` | A list describing the eigenspace of a data matrix, with components `u` (left eigenvectors), `v` (right eigenvectors), `m` (number of cases), `d` (eigenvalues) and `orgn` (data mean) |
| `sv` | Singular values |
| `inertia_e` | Percentage of explained variance |
| `levelnames` | Attribute labels |
| `rowctr` | Row contributions |
| `colctr` | Column contributions |
| `rowcor` | Row squared correlations |
| `colcor` | Column squared correlations |
| `nchunk` | A copy of `nchunk` in the return object |
| `disk` | A copy of `disk` in the return object |
| `allrowcoord` | A list containing the row scores on the principal components produced after each data chunk is analyzed; returned only when `disk = FALSE` |
| `allcolcoord` | A list containing the variable loadings on the principal components produced after each data chunk is analyzed; returned only when `disk = FALSE` |
| `allrowctr` | A list containing the row contributions after each data chunk is analyzed; returned only when `disk = FALSE` |
| `allcolctr` | A list containing the column contributions after each data chunk is analyzed; returned only when `disk = FALSE` |
| `allrowcor` | A list containing the row squared correlations produced after each data chunk is analyzed; returned only when `disk = FALSE` |
| `allcolcor` | A list containing the column squared correlations produced after each data chunk is analyzed; returned only when `disk = FALSE` |
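
As a rough guide to how quantities such as eigenvalues, percentage of explained variance and row scores relate to one another, the following base R sketch runs an ordinary (non-incremental) PCA on a covariance matrix. The variable names are illustrative, and the scaling conventions used by `i_pca` for its returned components may differ from those shown here.

```r
# Illustrative sketch only: ordinary PCA in base R, not the idm code.
set.seed(42)
X  <- matrix(rnorm(100 * 3), ncol = 3) %*% diag(c(3, 2, 1))
Xc <- sweep(X, 2, colMeans(X))           # subtract the data mean

eig           <- eigen(cov(X), symmetric = TRUE)  # PCA on the covariance matrix
eigenvalues   <- eig$values
pct_explained <- 100 * eigenvalues / sum(eigenvalues)

# Row scores: centered data projected onto the eigenvectors
row_scores <- Xc %*% eig$vectors

# The variance of each score column equals the corresponding eigenvalue
score_var <- apply(row_scores, 2, var)
all.equal(score_var, eigenvalues)
```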

### References

Hall, P., Marshall, D., & Martin, R. (2002). Adding and subtracting eigenspaces with eigenvalue decomposition and singular value decomposition. Image and Vision Computing, 20(13), 1009–1016.

Iodice D'Enza, A., & Markos, A. (2015). Low-dimensional tracking of association structures in categorical data. Statistics and Computing, 25(5), 1009–1022.

Iodice D'Enza, A., Markos, A., & Buttarazzi, D. (2018). The idm Package: Incremental Decomposition Methods in R. Journal of Statistical Software, Code Snippets, 86(4), 1–24. DOI: 10.18637/jss.v086.c04.

### See Also

`update.i_pca`, `i_mca`, `update.i_mca`, `add_es`

### Examples

```
data("segmentationData", package = "caret")
#center and standardize variables, keep 58 continuous attributes
HCS = data.frame(scale(segmentationData[,-c(1:3)]))
#abbreviate variable names for plotting
names(HCS) = abbreviate(names(HCS), minlength = 5)
#split the data into starting data and incoming data
data1 = HCS[1:150, ]
data2 = HCS[151:2019, ]
#Incremental PCA on the HCS data set: the incoming data is
#split into twenty chunks; the first 5 components/dimensions
#are computed in each update
res_iPCA = i_pca(data1, data2, current_rank = 5, nchunk = 20)
#Static plots
plot(res_iPCA, animation = FALSE)

#\donttest is used here because the code calls the saveLatex function of the animation package,
#which requires ImageMagick or GraphicsMagick to be installed.
#See help(im.convert) for details on the configuration of ImageMagick or GraphicsMagick.
#Creates animated plot in PDF for objects and variables
plot(res_iPCA, animation = TRUE, frames = 10, movie_format = 'pdf')

#Daily Closing Prices of Major European Stock Indices, 1991-1998
data("EuStockMarkets", package = "datasets")
res_iPCA = i_pca(data1 = EuStockMarkets[1:50,], data2 = EuStockMarkets[51:1860,], nchunk = 5)

#\donttest is used here because the code calls the saveLatex function of the animation package
#which requires ImageMagick or GraphicsMagick and