ICS-S3 | R Documentation |
Transforms the data via two scatter matrices to an invariant coordinate
system or independent components, depending on the underlying assumptions.
Function ICS() is intended as a replacement for ics() and ics2(), and it
combines their functionality into a single function. Importantly, the
results are returned as an S3 object rather than an S4 object. Furthermore,
ICS() implements recent improvements, such as a numerically stable algorithm
based on the QR algorithm for a common family of scatter pairs.
ICS(
X,
S1 = ICS_cov,
S2 = ICS_cov4,
S1_args = list(),
S2_args = list(),
algorithm = c("whiten", "standard", "QR"),
center = FALSE,
fix_signs = c("scores", "W"),
na.action = na.fail
)
X |
a numeric matrix or data frame containing the data to be transformed. |
S1 |
a numeric matrix containing the first scatter matrix, an object
of class |
S2 |
a numeric matrix containing the second scatter matrix, an object
of class |
S1_args |
a list containing additional arguments for |
S2_args |
a list containing additional arguments for |
algorithm |
a character string specifying with which algorithm
the invariant coordinate system is computed. Possible values are
|
center |
a logical indicating whether the invariant coordinates should
be centered with respect to the first location or not (default to |
fix_signs |
a character string specifying how to fix the signs of the
invariant coordinates. Possible values are |
na.action |
a function to handle missing values in the data (default
to |
For a given scatter pair S_{1} and S_{2}, a matrix Z (in which the columns
contain the scores of the respective invariant coordinates) and a matrix W
(in which the rows contain the coefficients of the linear transformation to
the respective invariant coordinates) are found such that:
The columns of Z are whitened with respect to S_{1}. That is, S_{1}(Z) = I, where I denotes the identity matrix.
The columns of Z are uncorrelated with respect to S_{2}. That is, S_{2}(Z) = D, where D is a diagonal matrix.
The columns of Z are ordered according to their generalized kurtosis.
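The first two criteria can also be written in terms of the scatter matrices of the original data X. The following is a short sketch under the assumption (made explicit here, not stated above) that both scatter functionals are affine equivariant, i.e. S_{k}(X A^T) = A S_{k}(X) A^T, and that the scores are Z = X W^T:

```latex
\begin{aligned}
S_{1}(Z) &= W \, S_{1}(X) \, W^{\top} = I, \\
S_{2}(Z) &= W \, S_{2}(X) \, W^{\top} = D .
\end{aligned}
```

Under the same assumption, the first equation gives W^{-1} = S_{1}(X) W^T, so each row w_i of W satisfies S_{2}(X) w_i = \rho_i S_{1}(X) w_i, where \rho_i is the i-th diagonal element of D.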
Given those criteria, W is unique up to sign changes in its rows. The
argument fix_signs provides two ways to ensure uniqueness of W:
If argument fix_signs is set to "scores", the signs in W are fixed such that
the generalized skewness values of all components are positive. If S1 and S2
provide location components, which are denoted by T_{1} and T_{2}, the
generalized skewness values are computed as T_{1}(Z) - T_{2}(Z). Otherwise,
the skewness is computed by subtracting the column medians of Z from the
corresponding column means so that all components are right-skewed. This way
of fixing the signs is preferred in an invariant coordinate selection
framework.
If argument fix_signs is set to "W", the signs in W are fixed independently
of Z such that the maximum element in each row of W is positive and that
each row has norm 1. This is the usual way of fixing the signs in an
independent component analysis framework.
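The two sign conventions can be sketched in base R as follows. This is an illustration only, not the package's code: a PCA-based whitening matrix stands in for W, the skewness is computed with the mean-minus-median fallback described above, and "maximum element" is read as the element with the largest absolute value, which is an assumption.

```r
X <- as.matrix(iris[, -5])

# Stand-in for the ICS matrix W (assumption: any unmixing matrix serves to
# illustrate the sign conventions); rows hold the coefficients.
e <- eigen(cov(X), symmetric = TRUE)
W <- diag(1 / sqrt(e$values)) %*% t(e$vectors)
Z <- X %*% t(W)

# fix_signs = "scores": make every component right-skewed, measuring
# skewness as column mean minus column median
skew <- colMeans(Z) - apply(Z, 2, median)
signs <- ifelse(skew >= 0, 1, -1)
W_scores <- W * signs                 # multiplies row i of W by signs[i]
Z_scores <- sweep(Z, 2, signs, "*")   # flips the corresponding score columns

# fix_signs = "W": scale each row to norm 1 and make its largest element
# (in absolute value) positive, independently of the scores
sign_max <- apply(W, 1, function(w) sign(w[which.max(abs(w))]))
row_norm <- sqrt(rowSums(W^2))
W_W <- W * (sign_max / row_norm)
```

Note that only the signs and row scalings change; the directions spanned by the rows of W are the same under both conventions.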
In principle, the order of S_{1} and S_{2} does not matter if both are true
scatter matrices. Changing their order will just reverse the order of the
components and invert the corresponding generalized kurtosis values.
The same does not hold when at least one of them is a shape matrix rather than a true scatter matrix. In that case, changing their order will also reverse the order of the components, but the ratio of the generalized kurtosis values is no longer 1 but only a constant. This is due to the fact that when shape matrices are used, the generalized kurtosis values are only relative ones.
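For the case of two true scatter matrices, the reversal and inversion can be checked numerically. The following is a minimal sketch in base R rather than the package's implementation: inv_sqrt(), cov4_sketch() and gen_kurtosis_sketch() are hypothetical helpers written for this illustration, and cov4_sketch() is assumed to mimic the fourth-moment scatter matrix computed by cov4().

```r
# Hypothetical helper: symmetric inverse square root of a positive
# definite matrix
inv_sqrt <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
}

# Hypothetical helper: scatter matrix based on fourth moments
# (assumed to mimic cov4())
cov4_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))
  d2 <- mahalanobis(X, colMeans(X), cov(X))  # squared Mahalanobis distances
  crossprod(sqrt(d2) * Xc) / (n * (p + 2))
}

# Hypothetical helper: eigenvalues of S1^{-1/2} S2 S1^{-1/2}, in
# decreasing order
gen_kurtosis_sketch <- function(S1, S2) {
  R <- inv_sqrt(S1)
  M <- R %*% S2 %*% R
  eigen((M + t(M)) / 2, symmetric = TRUE)$values
}

X <- as.matrix(iris[, -5])
S_cov <- cov(X)
S_cov4 <- cov4_sketch(X)

k12 <- gen_kurtosis_sketch(S_cov, S_cov4)   # order (S1, S2)
k21 <- gen_kurtosis_sketch(S_cov4, S_cov)   # order (S2, S1)

# Swapping the scatters reverses the order and inverts the values
all.equal(k21, rev(1 / k12))
```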
Different algorithms are available to compute the invariant coordinate
system of a data frame X_n with n observations:
"whiten": whitens the data X_n with respect to the first scatter matrix
before computing the second scatter matrix. If S2 is not a function,
whitening is not applicable.
whiten the data X_n with respect to the first scatter matrix: Y_n = X_n S_1(X_n)^{-1/2}
compute S_2 for the uncorrelated data: S_2(Y_n)
perform the eigendecomposition of S_2(Y_n): S_2(Y_n) = UDU'
compute W: W = U' S_1(X_n)^{-1/2}
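The four steps of the "whiten" algorithm can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by ICS(): S_1 is taken to be cov(), and inv_sqrt() and cov4_sketch() are hypothetical helpers, the latter assumed to mimic cov4().

```r
# Hypothetical helper: symmetric inverse square root of a positive
# definite matrix
inv_sqrt <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
}

# Hypothetical helper: scatter matrix based on fourth moments
# (assumed to mimic cov4())
cov4_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))
  d2 <- mahalanobis(X, colMeans(X), cov(X))  # squared Mahalanobis distances
  crossprod(sqrt(d2) * Xc) / (n * (p + 2))
}

X <- as.matrix(iris[, -5])
p <- ncol(X)

# Step 1: whiten the data with respect to S1 = cov
S1_inv_sqrt <- inv_sqrt(cov(X))
Y <- X %*% S1_inv_sqrt

# Step 2: compute S2 on the whitened data
S2_Y <- cov4_sketch(Y)

# Step 3: eigendecomposition S2(Y) = U D U'
eig <- eigen(S2_Y, symmetric = TRUE)
U <- eig$vectors
D <- eig$values   # eigenvalues in decreasing order

# Step 4: W = U' S1(X)^{-1/2}, scores Z = X W'
W <- t(U) %*% S1_inv_sqrt
Z <- X %*% t(W)

# Both deviations should be numerically negligible
max(abs(cov(Z) - diag(p)))           # S1(Z) = I
max(abs(cov4_sketch(Z) - diag(D)))   # S2(Z) = D
```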
"standard": performs the spectral decomposition of the
symmetric matrix M(X_n)
compute M(X_n) = S_1(X_n)^{-1/2} S_2(X_n) S_1(X_n)^{-1/2}
perform the eigendecomposition of M(X_n): M(X_n) = UDU'
compute W: W = U' S_1(X_n)^{-1/2}
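The "standard" steps only need the two scatter matrices themselves, which is why they also apply to precomputed matrices. The following base R sketch illustrates this under stated assumptions and is not the package's implementation: inv_sqrt() and cov4_sketch() are hypothetical helpers, the latter assumed to mimic cov4().

```r
# Hypothetical helper: symmetric inverse square root of a positive
# definite matrix
inv_sqrt <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
}

# Hypothetical helper: scatter matrix based on fourth moments
# (assumed to mimic cov4())
cov4_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))
  d2 <- mahalanobis(X, colMeans(X), cov(X))  # squared Mahalanobis distances
  crossprod(sqrt(d2) * Xc) / (n * (p + 2))
}

X <- as.matrix(iris[, -5])
p <- ncol(X)

# Precomputed scatter matrices
S1_X <- cov(X)
S2_X <- cov4_sketch(X)

# Step 1: M = S1^{-1/2} S2 S1^{-1/2}, symmetrized to guard against rounding
S1_inv_sqrt <- inv_sqrt(S1_X)
M <- S1_inv_sqrt %*% S2_X %*% S1_inv_sqrt
M <- (M + t(M)) / 2

# Step 2: eigendecomposition M = U D U'
eig <- eigen(M, symmetric = TRUE)
U <- eig$vectors
D <- eig$values

# Step 3: W = U' S1^{-1/2}
W <- t(U) %*% S1_inv_sqrt

# W diagonalizes both scatter matrices simultaneously
max(abs(W %*% S1_X %*% t(W) - diag(p)))
max(abs(W %*% S2_X %*% t(W) - diag(D)))
```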
"QR": numerically stable algorithm based on the QR algorithm for a
common family of scatter pairs: if S1 is ICS_cov() or cov(), and if S2 is
one of ICS_cov4(), ICS_covW(), ICS_covAxis(), cov4(), covW(), or covAxis().
For other scatter pairs, the QR algorithm is not applicable. See Archimbaud
et al. (2023) for details.
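As a usage sketch, the call below requests the QR algorithm for the default scatter pair and compares the resulting generalized kurtosis values with those from "whiten". It assumes that the ICS package is installed and uses only functions named on this page; that the two algorithms agree closely on well-conditioned data such as iris is an expectation, not a documented guarantee.

```r
if (requireNamespace("ICS", quietly = TRUE)) {
  library(ICS)
  X <- iris[, -5]

  # same scatter pair, two different algorithms
  out_whiten <- ICS(X, S1 = ICS_cov, S2 = ICS_cov4, algorithm = "whiten")
  out_QR <- ICS(X, S1 = ICS_cov, S2 = ICS_cov4, algorithm = "QR")

  # generalized kurtosis values side by side
  kurt_whiten <- gen_kurtosis(out_whiten)
  kurt_QR <- gen_kurtosis(out_QR)
  print(cbind(whiten = kurt_whiten, QR = kurt_QR))
}
```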
The "whiten" algorithm is the most natural version and therefore the default. The option "standard" should only be used if the scatters provided are not functions but precomputed matrices. The option "QR" is mainly of interest when "whiten" runs into numerical issues and the scatter combination allows its usage.
Note that when the purpose of ICS is outlier detection, the package
ICSOutlier provides additional functionality, as does the package ICSClust
in case the goal of ICS is dimension reduction prior to clustering.
An object of class "ICS" with the following components:
gen_kurtosis |
a numeric vector containing the generalized kurtosis values of the invariant coordinates. |
W |
a numeric matrix in which each row contains the coefficients of the linear transformation to the corresponding invariant coordinate. |
scores |
a numeric matrix in which each column contains the scores of the corresponding invariant coordinate. |
gen_skewness |
a numeric vector containing the (generalized) skewness
values of the invariant coordinates (only returned if
|
S1_label |
a character string providing a label for the first scatter matrix to be used by various methods. |
S2_label |
a character string providing a label for the second scatter matrix to be used by various methods. |
S1_args |
a list containing additional arguments used to compute
|
S2_args |
a list containing additional arguments used to compute
|
algorithm |
a character string specifying how the invariant coordinate system was computed. |
center |
a logical indicating whether or not the data were centered with respect to the first location vector before computing the invariant coordinates. |
fix_signs |
a character string specifying how the signs of the invariant coordinates were fixed. |
Andreas Alfons and Aurore Archimbaud, based on code for ics() and ics2() by
Klaus Nordhausen
Tyler, D.E., Critchley, F., Duembgen, L. and Oja, H. (2009) Invariant Co-ordinate Selection. Journal of the Royal Statistical Society, Series B, 71(3), 549–592. doi:10.1111/j.1467-9868.2009.00706.x.
Archimbaud, A., Drmac, Z., Nordhausen, K., Radojcic, U. and Ruiz-Gazen, A. (2023) Numerical Considerations and a New Implementation for Invariant Coordinate Selection. SIAM Journal on Mathematics of Data Science, 5(1), 97–121. doi:10.1137/22M1498759.
gen_kurtosis(), coef(), components(), fitted(), and plot() methods
# import data
data("iris")
X <- iris[,-5]
# run ICS
out_ICS <- ICS(X)
out_ICS
summary(out_ICS)
# extract generalized eigenvalues
gen_kurtosis(out_ICS)
# Plot
screeplot(out_ICS)
# extract the components
components(out_ICS)
components(out_ICS, select = 1:2)
# Plot
plot(out_ICS)
# equivalence with previous functions
out_ics <- ics(X, S1 = cov, S2 = cov4, stdKurt = FALSE)
out_ics
out_ics2 <- ics2(X, S1 = MeanCov, S2 = Mean3Cov4)
out_ics2
out_ICS
# example using two functions
library(mvtnorm)  # provides rmvnorm()
X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04)))
X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04)))
X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04)))
X.comps <- rbind(X1,X2,X3)
A <- matrix(rnorm(64),nrow=8)
X <- X.comps %*% t(A)
ics.X.1 <- ICS(X)
summary(ics.X.1)
plot(ics.X.1)
# compare to
pairs(X)
pairs(princomp(X,cor=TRUE)$scores)
# slow:
if (require("ICSNP")) {
  ics.X.2 <- ICS(X, S1 = tyler.shape, S2 = duembgen.shape,
                 S1_args = list(location = 0))
  summary(ics.X.2)
  plot(ics.X.2)
  # example using three pictures
  library(pixmap)
  fig1 <- read.pnm(system.file("pictures/cat.pgm", package = "ICS")[1],
                   cellres = 1)
  fig2 <- read.pnm(system.file("pictures/road.pgm", package = "ICS")[1],
                   cellres = 1)
  fig3 <- read.pnm(system.file("pictures/sheep.pgm", package = "ICS")[1],
                   cellres = 1)
  p <- dim(fig1@grey)[2]
  fig1.v <- as.vector(fig1@grey)
  fig2.v <- as.vector(fig2@grey)
  fig3.v <- as.vector(fig3@grey)
  X <- cbind(fig1.v, fig2.v, fig3.v)
  # mix the three pictures with a random matrix
  A <- matrix(rnorm(9), ncol = 3)
  X.mixed <- X %*% t(A)
  ICA.fig <- ICS(X.mixed)
  # save the settable graphical parameters so they can be restored afterwards
  par.old <- par(no.readonly = TRUE)
  par(mfrow = c(3, 3), omi = c(0.1, 0.1, 0.1, 0.1), mai = c(0.1, 0.1, 0.1, 0.1))
  # first row: original pictures
  plot(fig1)
  plot(fig2)
  plot(fig3)
  # second row: mixed pictures
  plot(pixmapGrey(X.mixed[,1], ncol = p, cellres = 1))
  plot(pixmapGrey(X.mixed[,2], ncol = p, cellres = 1))
  plot(pixmapGrey(X.mixed[,3], ncol = p, cellres = 1))
  # third row: pictures recovered by ICS
  plot(pixmapGrey(components(ICA.fig)[,1], ncol = p, cellres = 1))
  plot(pixmapGrey(components(ICA.fig)[,2], ncol = p, cellres = 1))
  plot(pixmapGrey(components(ICA.fig)[,3], ncol = p, cellres = 1))
  par(par.old)
}