acpgen: Generalised principal component analysis

View source: R/acprob.R

acpgenR Documentation

Generalised principal component analysis

Description

Generalised principal component analysis

Usage

acpgen(x,h1,h2,center=TRUE,reduce=TRUE,kernel="gaussien")
K(u,kernel="gaussien")
W(x,h,D=NULL,kernel="gaussien")

Arguments

x

Matrix or data frame

h

Scalar: bandwidth of the Kernel

h1

Scalar: bandwidth of the Kernel for W

h2

Scalar: bandwidth of the Kernel for U

kernel

The kernel used. This must be one of '"gaussien"', '"quartic"', '"triweight"', '"epanechikov"' , '"cosinus"' or '"uniform"'

center

A logical value indicating whether we center data

reduce

A logical value indicating whether we "reduce" data i.e. divide each column by standard deviation

D

A product scalar matrix / une matrice de produit scalaire

u

Vector

Details

acpgen compute generalised pca. i.e. spectral analysis of U_n . W_n^{-1}, and project X_i with W_n^{-1} on the principal vector sub-spaces.

X_i a column vector of p variables of individu i (input data)

W compute estimation of noise in the variance.

W_n=\frac{\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}K(||X_i-X_j||_{V_n^{-1}}/h)(X_i-X_j)(X_i-X_j)'}{\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}K(||X_i-X_j||_{V_n^{-1}}/h)}

with V_n variance estimation;

U compute robust variance. U_n^{-1} = S_n^{-1} - 1/h V_n^{-1}

S_n=\frac{\sum_{i=1}^{n}K(||X_i||_{V_n^{-1}}/h)(X_i-\mu_n)(X_i-\mu_n)'}{\sum_{i=1}^nK(||X_i||_{V_n^{-1}}/h)}

with \mu_n estimator of the mean.

K compute kernel, i.e.

gaussien:

\frac{1}{\sqrt{2\pi}} e^{-u^2/2}

quartic:

\frac{15}{16}(1-u^2)^2 I_{|u|\leq 1}

triweight:

\frac{35}{32}(1-u^2)^3 I_{|u|\leq 1}

epanechikov:

\frac{3}{4}(1-u^2) I_{|u|\leq 1}

cosinus:

\frac{\pi}{4}\cos(\frac{\pi}{2}u) I_{|u|\leq 1}

Value

An object of class acp The object is a list with components:

sdev

the standard deviations of the principal components.

loadings

the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). This is of class "loadings": see loadings for its print method.

scores

if scores = TRUE, the scores of the supplied data on the principal components.

eig

Eigen values

Author(s)

Antoine Lucas

References

H. Caussinus, M. Fekri, S. Hakam and A. Ruiz-Gazen, A monitoring display of multivariate outliers Computational Statistics & Data Analysis, Volume 44, Issues 1-2, 28 October 2003, Pages 237-252

Caussinus, H and Ruiz-Gazen, A. (1993): Projection Pursuit and Generalized Principal Component Analyses, in New Directions in Statistical Data Analysis and Robustness (eds. Morgenthaler et al.), pp. 35-46. Birk\"auser Verlag Basel.

Caussinus, H. and Ruiz-Gazen, A. (1995). Metrics for Finding Typical Structures by Means of Principal Component Analysis. In Data Science and its Applications (eds Y. Escoufier and C. Hayashi), pp. 177-192. Tokyo: Academic Press.

Antoine Lucas and Sylvain Jasson, Using amap and ctc Packages for Huge Clustering, R News, 2006, vol 6, issue 5 pages 58-60.

See Also

acp acprob princomp

Examples

data(lubisch)
lubisch <- lubisch[,-c(1,8)]
p <- acpgen(lubisch,h1=1,h2=1/sqrt(2))
plot(p,main='ACP robuste des individus')

# See difference with acp

p <- princomp(lubisch)
class(p)<- "acp"


amap documentation built on Oct. 30, 2024, 9:09 a.m.