i_pca: Incremental Principal Component Analysis (PCA)

Description

This function computes the Principal Component Analysis (PCA) solution on the covariance matrix using the incremental method of Hall, Marshall & Martin (2002).
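The idea behind an incremental solution can be illustrated with a minimal base-R sketch. It is not the eigenspace-merging algorithm of Hall, Marshall & Martin (2002) and not the code used by i_pca; it only shows that the mean and covariance matrix of the full data can be recovered from per-chunk summaries, so that the eigendecomposition matches a batch PCA. The helpers merge_stats and scatter are hypothetical names introduced for this illustration.

```r
# Illustration only: combine per-chunk means and scatter matrices,
# then compare with a batch PCA on the covariance matrix.
set.seed(1)
X  <- matrix(rnorm(200 * 4), ncol = 4)
X1 <- X[1:50, ]      # starting data
X2 <- X[51:200, ]    # incoming data

# Scatter matrix: cross-product of the column-centered data
scatter <- function(A) crossprod(scale(A, center = TRUE, scale = FALSE))

# Hypothetical helper: merge the summaries of two chunks
merge_stats <- function(n1, m1, S1, n2, m2, S2) {
  n <- n1 + n2
  d <- m2 - m1
  list(n = n,
       mean = m1 + d * n2 / n,
       scatter = S1 + S2 + tcrossprod(d) * n1 * n2 / n)
}

st <- merge_stats(nrow(X1), colMeans(X1), scatter(X1),
                  nrow(X2), colMeans(X2), scatter(X2))

# Eigendecomposition of the merged covariance vs. the batch covariance
eig_inc   <- eigen(st$scatter / (st$n - 1), symmetric = TRUE)
eig_batch <- eigen(cov(X), symmetric = TRUE)
all.equal(eig_inc$values, eig_batch$values)
```

The method referenced above works on eigenspaces directly rather than on full covariance matrices, which is what allows a reduced rank to be kept between updates.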

Usage

i_pca(data1, data2, current_rank, nchunk = 2, disk = FALSE)

Arguments

data1

Matrix or data frame of starting data, or full data if data2 = NULL

data2

Matrix or data frame of incoming data; omitted when full data is given in data1

current_rank

Rank of approximation or number of components to compute; if empty, the full rank is used

nchunk

Number of incoming data chunks (equal splits of 'data2', default = 2), or a vector giving the number of rows in each incoming data chunk

disk

Logical indicating whether the output is saved to hard disk
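The two forms of nchunk can be pictured with a small base-R sketch. The helper chunk_rows is hypothetical and not part of idm; how the package distributes leftover rows when the split is not exact is an assumption here (extra rows go to the first chunks).

```r
# Hypothetical helper: turn 'nchunk' into a list of row indices per chunk.
chunk_rows <- function(n, nchunk) {
  if (length(nchunk) == 1) {
    # Single number: near-equal contiguous splits (assumed behaviour)
    sizes <- rep(n %/% nchunk, nchunk)
    extra <- n %% nchunk
    if (extra > 0) sizes[seq_len(extra)] <- sizes[seq_len(extra)] + 1
  } else {
    # Vector: explicit row size of each chunk; sizes must cover all rows
    stopifnot(sum(nchunk) == n)
    sizes <- nchunk
  }
  split(seq_len(n), rep(seq_along(sizes), times = sizes))
}

# 1869 incoming rows in 20 near-equal chunks
lengths(chunk_rows(1869, 20))

# The same rows in three chunks of explicit size
lengths(chunk_rows(1869, c(500, 500, 869)))
```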

Value

rowpcoord

Row scores on the principal components

colpcoord

Variable loadings

eg

A list describing the eigenspace of a data matrix, with components
u Left eigenvectors
v Right eigenvectors
m Number of cases
d Eigenvalues
orgn Data mean

sv

Singular values

inertia_e

Percentage of explained variance

levelnames

Attribute labels

rowctr

Row contributions

colctr

Column contributions

rowcor

Row squared correlations

colcor

Column squared correlations

nchunk

A copy of nchunk in the return object

disk

A copy of disk in the return object

allrowcoord

A list containing the row scores on the principal components produced after each data chunk is analyzed; returned only when disk = FALSE

allcolcoord

A list containing the variable loadings on the principal components produced after each data chunk is analyzed; returned only when disk = FALSE

allrowctr

A list containing the row contributions after each data chunk is analyzed; returned only when disk = FALSE

allcolctr

A list containing the column contributions after each data chunk is analyzed; returned only when disk = FALSE

allrowcor

A list containing the row squared correlations produced after each data chunk is analyzed; returned only when disk = FALSE

allcolcor

A list containing the column squared correlations produced after each data chunk is analyzed; returned only when disk = FALSE
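As a rough guide to how some of these quantities relate, the sketch below computes base-R analogues of the singular values, eigenvalues, percentage of explained variance and row scores for a batch PCA on the covariance matrix. It uses svd() and prcomp() from base R, not idm, and the exact scaling conventions used by i_pca (for example for sv or colpcoord) may differ; treat the names below as illustrative.

```r
# Base-R analogue (not idm's internal code) of some returned quantities.
X <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)
n <- nrow(X)

dec <- svd(X)
singular_values <- dec$d
eigenvalues     <- dec$d^2 / (n - 1)               # component variances
explained       <- 100 * eigenvalues / sum(eigenvalues)
row_scores      <- X %*% dec$v                     # scores on the components

# Cross-check against prcomp(), which centers but does not scale by default
pc <- prcomp(USArrests)
all.equal(eigenvalues, pc$sdev^2)
# Component signs are arbitrary, so compare absolute values
all.equal(abs(unname(row_scores)), abs(unname(pc$x)))
```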

References

Hall, P., Marshall, D., & Martin, R. (2002). Adding and subtracting eigenspaces with eigenvalue decomposition and singular value decomposition. Image and Vision Computing, 20(13), 1009-1016.

Iodice D'Enza, A., & Markos, A. (2015). Low-dimensional tracking of association structures in categorical data. Statistics and Computing, 25(5), 1009–1022.

Iodice D'Enza, A., Markos, A., & Buttarazzi, D. (2018). The idm Package: Incremental Decomposition Methods in R. Journal of Statistical Software, Code Snippets, 86(4), 1–24. DOI: 10.18637/jss.v086.c04.

See Also

update.i_pca, i_mca, update.i_mca, add_es

Examples

data("segmentationData", package = "caret")
#center and standardize variables, keep 58 continuous attributes
HCS = data.frame(scale(segmentationData[,-c(1:3)]))
#abbreviate variable names for plotting
names(HCS) = abbreviate(names(HCS), minlength = 5)
#split the data into starting data and incoming data
data1 = HCS[1:150, ]
data2 = HCS[151:2019, ]
#Incremental PCA on the HCS data set: the incoming data is
#split into twenty chunks; the first 5 components/dimensions
#are computed in each update
res_iPCA = i_pca(data1, data2, current_rank = 5, nchunk = 20)
#Static plots 
plot(res_iPCA, animation = FALSE)

#\donttest is used here because the code calls the saveLatex function of the animation package 
#which requires ImageMagick or GraphicsMagick and 
#Adobe Acrobat Reader to be installed in your system 
#See help(im.convert) for details on the configuration of ImageMagick or GraphicsMagick.
#Creates animated plot in PDF for objects and variables
plot(res_iPCA, animation = TRUE, frames = 10, movie_format = 'pdf')


#Daily Closing Prices of Major European Stock Indices, 1991-1998 
data("EuStockMarkets", package = "datasets") 
res_iPCA = i_pca(data1 = EuStockMarkets[1:50,], data2 = EuStockMarkets[51:1860,], nchunk = 5) 

#\donttest is used here because the code calls the saveLatex function of the animation package 
#which requires ImageMagick or GraphicsMagick and 
#Adobe Acrobat Reader to be installed in your system 
#See help(im.convert) for details on the configuration of ImageMagick or GraphicsMagick.
#Creates animated plot in PDF movies for objects and variables
plot(res_iPCA, animation = TRUE, frames = 10, movie_format = 'pdf')

idm documentation built on May 2, 2019, 9:20 a.m.
