| CauRuimet | R Documentation | 
Gives a robust estimate of an unknown within-group covariance, aimed at revealing either dense groups or sparse groups (outliers), depending on the choice of local variance and weighting function.
 CauRuimet(Z,ker=1,m0=1,withingroup=TRUE,
              loc=substitute(apply(Z,2,mean,trim=.1)),matrixmethod=TRUE, Nrandom=3000)
| Z | matrix | 
| ker | either numerical or a function:
if numerical, the weighting function is e^{(-ker \;t)} | 
| m0 | a neighbourhood graph or another proximity matrix; it is combined with the kernel proximities through a Hadamard (elementwise) product | 
| withingroup | logical,if  | 
| loc | a vector of locations, or a function (e.g. based on the mean or the median) giving an estimate of it | 
| matrixmethod | if  | 
| Nrandom | if  | 
When withingroup is TRUE, the local variance (locality being defined by the weighting) is returned,
aimed at finding dense groups: 
W_l=\frac{\sum_{i=1}^{n-1}\sum_{j=i+1}^n
m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))(Z_i-Z_j)'(Z_i-Z_j)}{\sum_{i=1}^{n-1}\sum_{j=i+1}^n
m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))}
 where d^2_{S^-}( . , .) is the squared Euclidean distance with respect to
S^-, the inverse of a robust sample covariance (i.e. using loc instead of the mean).
If withingroup is FALSE, the robust total weighted variance W_o is returned or, if m0 is not 1, the global weighted variance W_g:
W_o=\frac{\sum_{i=1}^nker(d^2_{S^-}(Z_i,\tilde{Z}))(Z_i-\tilde{Z})'(Z_i-\tilde{Z})}
 {\sum_{i=1}^n  ker(d^2_{S^-}(Z_i,\tilde{Z}))}
W_g=\frac{\sum_{i=1}^{n-1}\sum_{j=i+1}^n
m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))(Z_i-\tilde{Z})'(Z_j-\tilde{Z})}
 {\sum_{i=1}^{n-1}\sum_{j=i+1}^n
m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))}
where \tilde{Z} is the vector loc.
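The formulas for W_l and W_o can be sketched in base R as follows. This is only an illustration, not the PTAk implementation: the helper name `local_and_total_variance` is hypothetical, the kernel is assumed to be e^{(-ker \;t)}, and S is assumed to be the cross-product of the rows centred on loc divided by n-1.

```r
# Illustrative sketch only; not the code used by CauRuimet().
# Assumptions: kernel exp(-ker * t); S = crossprod(Z - loc) / (n - 1).
local_and_total_variance <- function(Z, ker = 1, m0 = 1,
                                     loc = apply(Z, 2, mean, trim = 0.1)) {
  Z <- as.matrix(Z)
  n <- nrow(Z)
  p <- ncol(Z)
  Zc <- sweep(Z, 2, loc)                     # rows Z_i - loc
  S_inv <- solve(crossprod(Zc) / (n - 1))    # S^-
  w_fun <- function(t) exp(-ker * t)

  # W_o: one weight per row, from its squared distance to loc
  d2_o <- rowSums((Zc %*% S_inv) * Zc)
  w_o <- w_fun(d2_o)
  W_o <- crossprod(Zc * w_o, Zc) / sum(w_o)

  # W_l: one weight per pair i < j, from the pairwise squared distance
  if (length(m0) == 1) m0 <- matrix(m0, n, n)
  num <- matrix(0, p, p)
  den <- 0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      dij <- unname(Z[i, ] - Z[j, ])
      w <- m0[i, j] * w_fun(drop(t(dij) %*% S_inv %*% dij))
      num <- num + w * tcrossprod(dij)       # (Z_i - Z_j)'(Z_i - Z_j)
      den <- den + w
    }
  }
  list(W_l = num / den, W_o = W_o)
}

res <- local_and_total_variance(as.matrix(iris[, 1:4]), ker = 1)
```

As a sanity check on the sketch, setting ker = 0 makes every weight equal to 1, in which case W_l reduces to twice the usual sample covariance `cov(Z)`.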
If m0 is a neighbourhood graph and ker is the function returning 1 (no proximity due to
distance is used), the function will return (when withingroup=TRUE) the local
variance-covariance matrix as defined in Lebart (1969). 
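A minimal base R sketch of this special case is given here, assuming a symmetric 0/1 neighbourhood matrix. The helper name `lebart_local_cov` and the chain graph used in the example are hypothetical and only illustrate the W_l formula with constant kernel weights; they are not part of PTAk.

```r
# Illustrative sketch only: local covariance over neighbouring pairs,
# i.e. W_l with ker(.) = 1 and m0 = G (a neighbourhood matrix).
lebart_local_cov <- function(Z, G) {
  Z <- as.matrix(Z)
  # neighbouring pairs (i, j) with i < j
  pairs <- which(G != 0 & upper.tri(G), arr.ind = TRUE)
  D <- Z[pairs[, "row"], , drop = FALSE] - Z[pairs[, "col"], , drop = FALSE]
  w <- G[pairs]                      # m0_ij for each retained pair
  crossprod(D * w, D) / sum(w)
}

# Example: three points on a line, each linked to its successor
Z <- cbind(c(0, 1, 3))
n <- nrow(Z)
G <- (abs(outer(seq_len(n), seq_len(n), "-")) == 1) * 1
lebart_local_cov(Z, G)
```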
a matrix
As mentioned by Caussinus and Ruiz, a good strategy to reveal dense groups with generalised PCA
is to reveal outliers first using the metric W_o^{-1} and to remove them before using the
metric W_l^{-1}. Based on theoretical considerations, they recommend, when ker is numerical and
the weighting is the decreasing function e^{(-ker \;t)}: a lower bound of 1 if
withingroup is TRUE, and something fairly small, say in the interval [0.05;0.3], otherwise.
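To get a feel for these values, the short sketch below tabulates the weight e^{(-ker \;t)} at a few squared distances t. The helper `kernel_weight`, the grid of distances and the column labels are illustrative choices, not part of PTAk.

```r
# Illustrative sketch: kernel weights exp(-ker * t) for the suggested values.
kernel_weight <- function(t, ker) exp(-ker * t)

t_grid <- c(0, 1, 4, 9)   # squared distances (arbitrary illustration)
weights <- sapply(c(within = 1, total_low = 0.05, total_high = 0.3),
                  function(k) kernel_weight(t_grid, k))
rownames(weights) <- paste0("t=", t_grid)
round(weights, 3)
```

With ker = 1 the weight decays quickly as the squared distance grows, whereas with ker in [0.05;0.3] distant observations keep a larger weight.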
Didier G. Leibovici
Caussinus, H and Ruiz, A (1990) Interesting Projections of Multidimensional Data by Means of Generalized Principal Components Analysis. COMPSTAT90, Physica-Verlag, Heidelberg, 121-126.
Faraj, A (1994) Interpretation tools for Generalized Discriminant Analysis. In: New Approaches in Classification and Data Analysis, Springer-Verlag, Heidelberg, 286-291.
Lebart, L (1969) Analyse statistique de la contiguïté. Publication de l'Institut de Statistiques Universitaire de Paris, XVIII, 81-112.
Leibovici, D (2008) Spatio-temporal Multiway Decomposition using Principal Tensor Analysis on k-modes: the R package PTAk. To be submitted to the Journal of Statistical Software.
SVDgen
 library(PTAk)
 data(iris)
 iris2 <- as.matrix(iris[,1:4])
 dimnames(iris2)[[1]] <- as.character(iris[,5])
 # local (within-group) variance, then its inverse to be used as a metric
 D2 <- CauRuimet(iris2,ker=1,withingroup=TRUE)
 D2 <- Powmat(D2,(-1))
 # centre the columns, then decompose using the metric D2
 iris2 <- sweep(iris2,2,apply(iris2,2,mean))
 res <- SVDgen(iris2,D2=D2,D1=1)
 plot(res,nb1=1,nb2=2,cex=1,mod=1,Zcol=list(c(rep(1,50),rep(2,50),rep(3,50))))
 summary(res,testvar=0)
 # the same in a demo function
 # source(paste(R.home(),"/library/PTAk/demo/CauRuimet.R",sep=""))
 # demo.CauRuimet(ker=4,withingroup=TRUE,openX11s=FALSE)
 # demo.CauRuimet(ker=0.15,withingroup=FALSE,openX11s=FALSE)