trafos_dimreduction: Dimension-Reduction Transformations for Training or Sampling

trafos_dimreduction R Documentation

Description

Dimension-reduction transformations applied to an input data matrix. Currently, only the principal component transformation and its inverse are provided.

Usage

PCA_trafo(x, mu, Gamma, inverse = FALSE, ...)

Arguments

x

(n, d)-matrix of data (typically before training or after sampling). If inverse = TRUE, then, conceptually, an (n, k)-matrix with 1 \le k \le d, where d is the dimension of the original data whose dimension was reduced to k.

mu

if inverse = TRUE, a d-vector of centers, where d is the dimension to transform x to.

Gamma

if inverse = TRUE, a (d, k)-matrix with k at least as large as ncol(x) containing the k orthonormal eigenvectors of a covariance matrix sorted in decreasing order of their eigenvalues; in other words, the columns of Gamma contain principal axes or loadings. If a matrix with k greater than ncol(x) is provided, only the first ncol(x)-many columns are considered.

inverse

logical indicating whether the inverse transformation of the principal component transformation is applied.

...

additional arguments passed to the underlying prcomp().
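The shapes expected by the inverse transformation can be illustrated with a small base-R sketch. It assumes that only the first ncol(x) columns of Gamma are used, as the description of Gamma suggests; inverse_sketch() is a hypothetical helper written for this illustration and is not part of gnn.

```r
## Sketch of the argument shapes for the inverse transformation (base R only)
set.seed(2)
n <- 100; d <- 5; k <- 2
X <- matrix(rnorm(n * d), ncol = d) %*% matrix(rnorm(d * d), nrow = d)
mu <- colMeans(X)                                # d-vector of centers
Gamma <- eigen(cov(X), symmetric = TRUE)$vectors # (d, d): more columns than needed
Y.k <- sweep(X, 2, mu) %*% Gamma[, 1:k]          # (n, k)-matrix of the first k components

## Hypothetical helper (NOT gnn's implementation) mimicking the documented shapes
inverse_sketch <- function(y, mu, Gamma) {
    stopifnot(nrow(Gamma) == length(mu), ncol(Gamma) >= ncol(y))
    G <- Gamma[, seq_len(ncol(y)), drop = FALSE] # keep only the first ncol(y) axes
    sweep(y %*% t(G), 2, mu, FUN = "+")          # rows are mu + G y
}

X.k <- inverse_sketch(Y.k, mu = mu, Gamma = Gamma)
stopifnot(dim(X.k) == c(n, d))                   # back in the original dimension d
```

Here length(mu) fixes the output dimension d, while ncol(x) fixes how many principal axes enter the reconstruction.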

Details

Conceptually, the principal component transformation maps a vector \bm{X} to a vector \bm{Y} = \Gamma^T(\bm{X}-\bm{\mu}), where \bm{\mu} is the mean vector of \bm{X} and \Gamma is the (d, d)-matrix whose columns contain the orthonormal eigenvectors of cov(\bm{X}).

The corresponding (conceptual) inverse transformation is \bm{X} = \bm{\mu} + \Gamma \bm{Y}.

See McNeil et al. (2015, Section 6.4.5).
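The two formulas above can be checked numerically with a minimal base-R sketch. It works on sample versions (sample mean and sample covariance matrix) and uses eigen() directly; this is an illustration of the concept, not necessarily how PCA_trafo() computes its result.

```r
## Base-R sketch of the principal component transformation and its inverse
set.seed(1)
n <- 200; d <- 4
A <- matrix(rnorm(d * d), nrow = d)
X <- matrix(rnorm(n * d), ncol = d) %*% A    # (n, d)-matrix of correlated data

mu <- colMeans(X)                            # sample mean vector
E <- eigen(cov(X), symmetric = TRUE)         # eigenvalues sorted decreasingly
Gamma <- E$vectors                           # (d, d)-matrix of orthonormal eigenvectors

## Forward: each row of Y is Gamma^T (X - mu)
Y <- sweep(X, 2, mu) %*% Gamma
## Inverse: each row of X.back is mu + Gamma Y
X.back <- sweep(Y %*% t(Gamma), 2, mu, FUN = "+")

stopifnot(all.equal(X.back, X))              # the round trip recovers the data
stopifnot(all.equal(diag(cov(Y)), E$values)) # component variances are the eigenvalues
```

Because \Gamma is orthogonal, the inverse only needs the transpose, which is why the back-transformation is a single matrix product plus a shift by \bm{\mu}.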

Value

If inverse = TRUE, the transformed data whose rows contain \bm{X} = \bm{\mu} + \Gamma \bm{Y}, where \bm{Y} is one row of x. See the Details section for the notation.

If inverse = FALSE, a list containing:

PCs:

(n, d)-matrix of principal components.

cumvar:

cumulative variances; the jth entry is the fraction of the total variance explained by the first j principal components.

sd:

sample standard deviations of the transformed data.

lambda:

eigenvalues of cov(x).

mu:

d-vector of centers of x (see also above) typically provided to PCA_trafo(, inverse = TRUE).

Gamma:

(d, d)-matrix of principal axes (see also above) typically provided to PCA_trafo(, inverse = TRUE).
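How the components sd, lambda and cumvar relate to each other can be sketched with prcomp() from base R. The variable names below merely mirror the list components, and the sketch assumes these quantities follow from prcomp() in the usual way; it does not reproduce the package's source.

```r
## Sketch: standard deviations, eigenvalues and cumulative variances via prcomp()
set.seed(3)
X <- matrix(rnorm(300 * 4), ncol = 4) %*% matrix(rnorm(16), nrow = 4)
P <- prcomp(X)                         # centers by default, no scaling
sdev <- P$sdev                         # sample standard deviations of the components
lambda <- sdev^2                       # eigenvalues of cov(X)
cumvar <- cumsum(lambda) / sum(lambda) # fraction of variance explained by the first j

stopifnot(all.equal(lambda, eigen(cov(X), symmetric = TRUE)$values))
stopifnot(all.equal(sdev, unname(apply(P$x, 2, sd))))
stopifnot(all.equal(cumvar[length(cumvar)], 1)) # all components explain everything
```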

Author(s)

Marius Hofert

References

McNeil, A. J., Frey, R., and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press.

Examples

library(gnn) # for being standalone

## Generate data
library(copula)
set.seed(271)
X <- qt(rCopula(1000, gumbelCopula(2, dim = 10)), df = 3.5)
pairs(X, gap = 0, pch = ".")

## Principal component transformation
PCA <- PCA_trafo(X)
Y <- PCA$PCs
PCA$cumvar[3] # fraction of variance explained by the first 3 principal components
which.max(PCA$cumvar > 0.9) # number of principal components needed to explain more than 90% of the variance

## Biplot (plot of the first two principal components = data transformed with
## the first two principal axes)
plot(Y[,1:2])

## Transform back and compare
X. <- PCA_trafo(Y, mu = PCA$mu, Gamma = PCA$Gamma, inverse = TRUE)
stopifnot(all.equal(X., X))

## Note: One typically transforms back with only some of the principal axes
X. <- PCA_trafo(Y[,1:3], mu = PCA$mu, # mu determines the dimension to transform to
                Gamma = PCA$Gamma, # must be of dim. (length(mu), k) for k >= ncol(x)
                inverse = TRUE)
stopifnot(dim(X.) == c(1000, 10))
## Note: We (typically) transform back to the original dimension.
pairs(X., gap = 0, pch = ".") # pairs of back-transformed first three PCs

gnn documentation built on May 29, 2024, 6:13 a.m.
