FOBIasymp: Testing for the Number of Gaussian Components in NGCA or ICA...


FOBIasymp    R Documentation

Testing for the Number of Gaussian Components in NGCA or ICA Using FOBI

Description

In non-Gaussian component analysis (NGCA) and independent component analysis (ICA), Gaussian components are considered uninteresting. Based on FOBI, the function tests whether there are p-k Gaussian components, where p is the dimension of the data. Three versions of the test are offered.

Usage

FOBIasymp(X, k, type = "S3", model = "NGCA", method = "satterthwaite")

Arguments

X

numeric data matrix.

k

the number of non-Gaussian components under the null hypothesis.

type

which of the three tests to perform. Options are "S1", "S2" and "S3". For the differences see the details section.

model

the underlying assumption about the non-Gaussian components. Options are the general "NGCA" model and the "ICA" model.

method

if type = "S1", the limiting distribution of the test statistic is a weighted sum of chi-square distributions, and the p-value is computed with the function pchisqsum. The method argument specifies which method pchisqsum uses for the computation. Options are "satterthwaite", "integration" and "saddlepoint".

Details

The function jointly diagonalizes the regular covariance matrix and the matrix of fourth moments. Note, however, that here the matrix of fourth moments is not made consistent under the normal model by dividing it by p+2, as is done for example by the function cov4, where p denotes the dimension of the data. Therefore, the eigenvalues of this generalized eigenvalue problem that correspond to normally distributed components should be p+2.

Given eigenvalues d_1,...,d_p, the function thus orders the components in descending order according to the values of (d_i-(p+2))^2.

Under the null it is then assumed that the first k interesting components are mutually independent and non-normal and that the last p-k are Gaussian.
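
As a rough illustration of the statement about the eigenvalues, the following base R sketch whitens purely Gaussian data, forms the matrix of fourth moments without the division by p+2, and orders its eigenvalues by (d_i-(p+2))^2. It is not the package's implementation: the variable names are made up for this example, and whitening with a symmetric inverse square root of the covariance matrix is only one assumed way of solving the joint diagonalization.

```r
set.seed(1)
n <- 5000
p <- 4
X <- matrix(rnorm(n * p), ncol = p)        # purely Gaussian data

# center and whiten with the symmetric inverse square root of cov(X)
Xc <- sweep(X, 2, colMeans(X))
e <- eigen(cov(Xc), symmetric = TRUE)
cov_inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
Z <- Xc %*% cov_inv_sqrt

# matrix of fourth moments of the whitened data, NOT divided by p + 2:
# B = (1/n) * sum_i ||z_i||^2 z_i z_i^T
r2 <- rowSums(Z^2)
B <- crossprod(Z * sqrt(r2)) / n

# eigenvalues, ordered by decreasing squared distance from p + 2
d <- eigen(B, symmetric = TRUE)$values
d <- d[order((d - (p + 2))^2, decreasing = TRUE)]
d                                          # all expected to lie near p + 2
```

Since every component is Gaussian here, all p eigenvalues should be close to p+2; replacing some columns of X by non-Gaussian variables would typically move the corresponding eigenvalues away from that value.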

Three possible tests are then available to test this null hypothesis for a sample of size n:

  1. type="S1": The test statistic T is the variance of the last p-k eigenvalues around p+2:

    T = n sum_{i=k+1}^p (d_i-(p+2))^2,

    whose limiting distribution under the null is a weighted sum of two chi-square distributions with weights:

    w_1 = 2*sigma_1 / (p-k) and w_2 = 2*sigma_1 / (p-k) + sigma_2

    and degrees of freedom:

    df_1 = (p-k-1)(p-k+2)/2 and df_2 = 1.

  2. type="S2": Another possible version for the test statistic is a scaled sum of the variance of the eigenvalues around the mean plus the variance around the expected value under normality (p+2). Let VAR_dpk denote the variance of the last p-k eigenvalues around their mean and VAR2_dpk the variance of these eigenvalues around p+2. Then the test statistic is:

    T = (n (p-k) VAR_dpk) / (2 sigma_1) + (n VAR2_dpk) / (2 sigma_1 / (p-k) + sigma_2).

    This test statistic has a limiting chi-square distribution with (p-k-1)(p-k+2)/2 + 1 degrees of freedom.

  3. type="S3": The third possible test statistic just checks the equality of the last p-k eigenvalues using only the first part of the test statistic of type="S2". The test statistic is then:

    T = (n (p-k) VAR_dpk) / (2 sigma_1)

    and has a limiting chi-square distribution with (p-k-1)(p-k+2)/2 degrees of freedom.

The constants sigma_1 and sigma_2 depend on the underlying model assumptions as specified in argument model and are estimated from the data.
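
To make the degrees of freedom and the weights more concrete, here is a small base R sketch. The helper names fobi_df1 and satterthwaite_pvalue are hypothetical and not part of the package, the values of sigma_1, sigma_2 and the test statistic are invented rather than estimated, and the moment-matching formula is the usual Satterthwaite-type approximation, which is assumed here and may differ in detail from what pchisqsum does.

```r
# df_1 = (p-k-1)(p-k+2)/2 as given for type = "S1"
fobi_df1 <- function(p, k) (p - k - 1) * (p - k + 2) / 2

# Satterthwaite-type approximation (assumed form): approximate
# sum_j w_j * chisq(df_j) by a * chisq(nu), matching mean and variance
satterthwaite_pvalue <- function(stat, w, df) {
  m <- sum(w * df)           # mean of the weighted sum
  v <- 2 * sum(w^2 * df)     # variance of the weighted sum
  a <- v / (2 * m)           # scale factor
  nu <- 2 * m^2 / v          # approximate degrees of freedom
  pchisq(stat / a, df = nu, lower.tail = FALSE)
}

p <- 6
k <- 3
df1 <- fobi_df1(p, k)        # degrees of freedom for type = "S3"
df_s2 <- df1 + 1             # degrees of freedom for type = "S2"

# hypothetical constants and test statistic, for illustration only
sigma1 <- 1.5
sigma2 <- 0.8
w <- c(2 * sigma1 / (p - k), 2 * sigma1 / (p - k) + sigma2)
p_s1 <- satterthwaite_pvalue(stat = 12, w = w, df = c(df1, 1))
```

When both weights are equal to one, the approximation reduces to an ordinary chi-square distribution with df_1 + 1 degrees of freedom, which gives a quick sanity check of the helper.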

Value

A list of class ictest inheriting from class htest containing:

statistic

the value of the test statistic.

p.value

the p-value of the test.

parameter

the degrees of freedom of the test or, if the limiting distribution of the test is a weighted sum of chi-square distributions, the degrees of freedom and the corresponding weights.

method

character string denoting which test was performed.

data.name

character string giving the name of the data.

alternative

character string specifying the alternative hypothesis.

k

the number of non-Gaussian components used in the testing problem.

W

the transformation matrix to the independent components, also known as the unmixing matrix.

S

data matrix with the centered independent components.

D

the underlying FOBI eigenvalues.

MU

the location of the data that was subtracted before calculating the independent components.

sigma1

the asymptotic constant sigma1 needed for the asymptotic test(s).

sigma2

the asymptotic constant sigma2 needed for the asymptotic test(s).

type

the value of type.

model

the value of model.

Author(s)

Klaus Nordhausen

References

Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.

Nordhausen, K., Oja, H., Tyler, D.E. and Virta, J. (2017), Asymptotic and Bootstrap Tests for the Dimension of the Non-Gaussian Subspace, IEEE Signal Processing Letters, 24, 887–891. <doi:10.1109/LSP.2017.2696880>.

See Also

FOBI, FOBIboot

Examples

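library(ICtest)

# three non-Gaussian and three Gaussian sources, mixed by a random matrix
n <- 1500
S <- cbind(runif(n), rchisq(n, 2), rexp(n), rnorm(n), rnorm(n), rnorm(n))
A <- matrix(rnorm(36), ncol = 6)
X <- S %*% t(A)

# default test (type = "S3", model = "NGCA") with k = 2 under the null
FOBIasymp(X, k = 2)
# weighted chi-square test with k = 3 under the null
FOBIasymp(X, k = 3, type = "S1")
# test that all components are Gaussian, under the ICA model
FOBIasymp(X, k = 0, type = "S2", model = "ICA")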

ICtest documentation built on May 18, 2022, 9:05 a.m.