NGPPest: Signal Subspace Dimension Testing Using non-Gaussian...

NGPPest R Documentation

Signal Subspace Dimension Testing Using non-Gaussian Projection Pursuit

Description

Estimates the dimension of the signal subspace by using NGPP to conduct sequential hypothesis tests. The test statistic is a multivariate extension of the classical Jarque-Bera statistic, and its distribution under the null hypothesis is obtained by simulation.

Usage

NGPPest(X, nl = c("skew", "pow3"), alpha = 0.8, N = 500, eps = 1e-6,
        verbose = FALSE, maxiter = 100)

Arguments

X

Numeric matrix with n rows corresponding to the observations and p columns corresponding to the variables.

nl

Vector of non-linearities. A convex combination of the corresponding squared objective functions is used as the projection index. The choices include "skew" (skewness), "pow3" (excess kurtosis), "tanh" (log(cosh)) and "gauss" (Gaussian function).

alpha

Vector of positive weights between 0 and 1 given to the non-linearities. The length of alpha should be either one less than the number of non-linearities, in which case the missing weight is chosen so that alpha sums to one, or equal to the number of non-linearities, in which case the weights are used as such. No boundary checks are done for the weights.
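The rule for completing alpha can be illustrated with a minimal sketch. The helper complete_weights is hypothetical and is not part of ICtest; it only mirrors the behaviour described above, and the length check it performs is an addition made for the illustration.

```r
# Hypothetical helper mirroring the documented rule for alpha; not ICtest code.
complete_weights <- function(nl, alpha) {
  if (length(alpha) == length(nl) - 1) {
    # One weight is missing: choose it so that the weights sum to one.
    alpha <- c(alpha, 1 - sum(alpha))
  } else if (length(alpha) != length(nl)) {
    # This length check is an assumption of the sketch.
    stop("alpha must have length(nl) or length(nl) - 1 elements")
  }
  # Label each weight with its non-linearity.
  setNames(alpha, nl)
}

# With the default arguments, "pow3" receives the remaining weight.
w <- complete_weights(c("skew", "pow3"), 0.8)
w
```

Under this reading, the defaults nl = c("skew", "pow3") and alpha = 0.8 weight squared skewness by 0.8 and squared excess kurtosis by the remainder.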

N

Number of normal samples to be used in simulating the distribution of the test statistic under the null hypothesis.

eps

Convergence tolerance.

verbose

If TRUE, the number of iterations used for each component is printed.

maxiter

Maximum number of iterations.

Details

It is assumed that the data are a random sample from the model x = m + A s, where the latent vector s = (s_1', s_2')' consists of a k-dimensional non-Gaussian subvector (the signal) and a (p - k)-dimensional Gaussian subvector (the noise), and the components of s are mutually independent. Without loss of generality we further assume that the components of s have zero means and unit variances.
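As an illustration of this model, the following sketch simulates a data matrix with k = 2 non-Gaussian and p - k = 2 Gaussian components using base R only. The source distributions, the mixing matrix A and the location vector m are arbitrary choices made for the example.

```r
set.seed(1)
n <- 500  # number of observations
p <- 4    # number of variables
k <- 2    # signal dimension

# Signal: k independent non-Gaussian components with mean 0 and variance 1.
s1 <- cbind(rexp(n) - 1,                  # Exp(1) shifted to mean 0
            runif(n, -sqrt(3), sqrt(3)))  # uniform with variance 1
# Noise: p - k independent standard Gaussian components.
s2 <- matrix(rnorm(n * (p - k)), nrow = n)
S <- cbind(s1, s2)

A <- matrix(rnorm(p * p), p, p)  # arbitrary mixing matrix (illustrative)
m <- c(1, -2, 0, 3)              # arbitrary location vector (illustrative)

# Row i of X is m + A s_i.
X <- sweep(S %*% t(A), 2, m, "+")
dim(X)
```

A matrix built this way has the n x p layout expected by the X argument, with a true signal dimension of 2.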

The algorithm first estimates all p components from the data using deflation-based NGPP with the chosen non-linearities and weights, and then tests the null hypothesis H0: k_true <= k for each k = 0, ..., p - 1. The test relies on the fact that under H0: k_true <= k the final p - k components follow a standard multivariate normal distribution. The significance of the test is obtained by comparing the objective function value of the (k + 1)th estimated component to the same quantity estimated from N samples of size n from the (p - k)-dimensional standard multivariate normal distribution.
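The simulation idea can be sketched in a heavily simplified, univariate form. The code below is not the implementation used by NGPPest: it omits the projection pursuit step entirely, the functions jb_index and mc_p_value are hypothetical, and the p-value convention (the proportion of simulated values at least as large as the observed one) is an assumption.

```r
# Simplified stand-in for the projection index: a convex combination of
# squared sample skewness and squared sample excess kurtosis.
jb_index <- function(z, alpha = 0.8) {
  z <- (z - mean(z)) / sqrt(mean((z - mean(z))^2))  # standardize
  alpha * mean(z^3)^2 + (1 - alpha) * (mean(z^4) - 3)^2
}

# Monte Carlo p-value: compare the observed index with the same quantity
# computed from N standard normal samples of the same size.
mc_p_value <- function(z, N = 200, alpha = 0.8) {
  observed <- jb_index(z, alpha)
  simulated <- replicate(N, jb_index(rnorm(length(z)), alpha))
  mean(simulated >= observed)
}

set.seed(1)
p_gauss <- mc_p_value(rnorm(300))  # Gaussian input
p_exp <- mc_p_value(rexp(300))     # clearly non-Gaussian input
c(gaussian = p_gauss, exponential = p_exp)
```

The exponential sample is strongly skewed, so its index should exceed nearly all simulated values and give a p-value close to zero, whereas the Gaussian sample has no such systematic departure.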

Note that if maxiter is reached at any step of the algorithm, the current estimated direction is used and the algorithm continues to the next step.

Value

A list with class 'icest' containing the following components:

statistic

Test statistic, i.e. the objective function values of all estimated components.

p.value

Obtained vector of p-values.

parameter

Number N of simulated normal samples.

method

Character string "Estimation the signal subspace dimension using NGPP".

data.name

Character string giving the name of the data.

W

Estimated unmixing matrix.

S

Matrix of size n x p containing the estimated signals.

D

Vector of the objective function values of the signals.

MU

Location vector of the data that was subtracted before estimating the signal components.

conv

Logical vector indicating for which components the algorithm converged (TRUE) and for which it did not (FALSE).
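One common way to turn sequential p-values into a dimension estimate is to take the smallest k for which the null hypothesis is not rejected. The sketch below assumes that the j-th element of p.value corresponds to H0: k_true <= j - 1; that ordering is an assumption and should be checked against the returned object. The helper estimate_dimension is hypothetical, and the p-values are made up for illustration only.

```r
# Hypothetical helper; assumes p.value[j] tests H0: k_true <= j - 1.
estimate_dimension <- function(p.value, level = 0.05) {
  accepted <- which(p.value > level)
  if (length(accepted) == 0) {
    # Every hypothesis rejected: all components are treated as signal.
    length(p.value)
  } else {
    # Smallest k whose null hypothesis is not rejected.
    accepted[1] - 1
  }
}

# Made-up p-values for illustration only, not output from NGPPest.
k_hat <- estimate_dimension(c(0.001, 0.010, 0.430, 0.760))
k_hat
```

The significance level of 0.05 is an illustrative default; no multiplicity correction is applied in this sketch.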

Author(s)

Joni Virta

References

Virta, J., Nordhausen, K. and Oja, H., (2016), Projection Pursuit for non-Gaussian Independent Components, <https://arxiv.org/abs/1612.05445>.

See Also

NGPP, NGPPsim

Examples

# Iris data

X <- as.matrix(iris[, 1:4])

# The number of simulations N should be increased in practical situations
# Now we settle for N = 100

res <- NGPPest(X, N = 100)
res$statistic
res$p.value
res$conv

ICtest documentation built on May 18, 2022, 9:05 a.m.