NGPPest | R Documentation |
Estimates the dimension of the signal subspace using NGPP to conduct sequential hypothesis testing. The test statistic is a multivariate extension of the classical Jarque-Bera statistic, and its distribution under the null hypothesis is obtained by simulation.
NGPPest(X, nl = c("skew", "pow3"), alpha = 0.8, N = 500, eps = 1e-6, verbose = FALSE, maxiter = 100)
X |
Numeric matrix with n rows corresponding to the observations and p columns corresponding to the variables. |
nl |
Vector of non-linearities, a convex combination of the corresponding squared objective functions of which is then used as the projection index. The choices include |
alpha |
Vector of positive weights between 0 and 1 given to the non-linearities. The length of |
N |
Number of normal samples to be used in simulating the distribution of the test statistic under the null hypothesis. |
eps |
Convergence tolerance. |
verbose |
If |
maxiter |
Maximum number of iterations. |
It is assumed that the data are a random sample from the model x = m + A s, where the latent vector s = (s_1', s_2')' consists of a k-dimensional non-Gaussian subvector (the signal) and a (p - k)-dimensional Gaussian subvector (the noise), and the components of s are mutually independent. Without loss of generality, we further assume that the components of s have zero means and unit variances.
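As an illustration of the model above, the following base R sketch simulates data from x = m + A s with k = 2 non-Gaussian and p - k = 2 Gaussian components. The signal distributions, the mixing matrix A, the location m and all variable names are arbitrary choices made for this example; they are not taken from the package.

```r
set.seed(1)
n <- 500  # number of observations
p <- 4    # total number of components
k <- 2    # number of non-Gaussian (signal) components

# Signal part: independent non-Gaussian components with mean 0 and variance 1
s1 <- cbind(rexp(n) - 1,                  # centered Exp(1): mean 0, variance 1
            runif(n, -sqrt(3), sqrt(3)))  # uniform: mean 0, variance 1

# Noise part: independent standard normal components
s2 <- matrix(rnorm(n * (p - k)), nrow = n, ncol = p - k)

S <- cbind(s1, s2)                         # n x p matrix of latent components
A <- matrix(rnorm(p * p), nrow = p)        # arbitrary mixing matrix
m <- c(1, -2, 0, 3)                        # arbitrary location vector

# Row-wise version of x = m + A s: each row of X is m' + s' A'
X <- sweep(S %*% t(A), 2, m, "+")
dim(X)
```

A matrix built this way has the shape expected for the X argument: n rows of observations and p columns of variables.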
The algorithm first estimates all p components from the data using deflation-based NGPP with the chosen non-linearities and weighting, and then tests the null hypothesis H0: k_true <= k for each k = 0, ..., p - 1. The testing is based on the fact that under H0: k_true <= k the final p - k components follow a standard multivariate normal distribution. The significance of the test is obtained by comparing the objective function value of the (k + 1)th estimated component to the same quantity estimated from N samples of size n from the (p - k)-dimensional standard multivariate normal distribution.
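The simulation idea can be sketched in a deliberately simplified, univariate form. The code below is not the package's implementation: the real procedure runs projection pursuit on (p - k)-dimensional normal samples, whereas this sketch evaluates a Jarque-Bera-type index on univariate normal samples only. The helper names jb_index and mc_pvalue, the 0.8/0.2 weighting of squared skewness and squared excess kurtosis, and the (count + 1) / (N + 1) p-value formula are all assumptions made for illustration.

```r
# Hypothetical helper: weighted sum of squared skewness and squared
# excess kurtosis of a standardized sample (a Jarque-Bera-type index)
jb_index <- function(u, alpha = 0.8) {
  z <- (u - mean(u)) / sd(u)
  alpha * mean(z^3)^2 + (1 - alpha) * (mean(z^4) - 3)^2
}

# Hypothetical helper: Monte Carlo p-value obtained by comparing the
# observed index with the index of N standard normal samples of size n
mc_pvalue <- function(observed, n, N = 500, alpha = 0.8) {
  null_values <- replicate(N, jb_index(rnorm(n), alpha))
  (sum(null_values >= observed) + 1) / (N + 1)
}

set.seed(123)
u <- rexp(200)                 # a clearly non-Gaussian sample
obs <- jb_index(u)             # observed index value
pval <- mc_pvalue(obs, n = length(u), N = 200)
pval
```

A larger N gives a finer resolution for the simulated p-value at the cost of more computation.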
Note that if maxiter is reached at any step of the algorithm, the current estimated direction is used and the algorithm continues to the next step.
A list with class 'icest' containing the following components:
statistic |
Test statistic, i.e. the objective function values of all estimated components. |
p.value |
Obtained vector of p-values. |
parameter |
Number |
method |
Character string |
data.name |
Character string giving the name of the data. |
W |
Estimated unmixing matrix |
S |
Matrix of size n x p containing the estimated signals. |
D |
Vector of the objective function values of the signals |
MU |
Location vector of the data, which was subtracted before estimating the signal components. |
conv |
Boolean vector telling for which components the algorithm converged ( |
Joni Virta
Virta, J., Nordhausen, K. and Oja, H. (2016), Projection Pursuit for non-Gaussian Independent Components, <https://arxiv.org/abs/1612.05445>.
NGPP, NGPPsim
# Iris data
X <- as.matrix(iris[, 1:4])

# The number of simulations N should be increased in practical situations
# Now we settle for N = 100
res <- NGPPest(X, N = 100)
res$statistic
res$p.value
res$conv
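One way to turn a vector of sequential p-values into a single dimension estimate is to take the smallest k for which H0: k_true <= k is not rejected. The sketch below shows that rule on made-up numbers. The p-values are hypothetical, not output of NGPPest, and the 0.05 level and the variable names are assumptions for illustration.

```r
# Hypothetical p-values for H0: k_true <= k with k = 0, 1, 2, 3
# (illustrative numbers only, not real output)
p_values <- c(0.002, 0.010, 0.350, 0.720)
level <- 0.05

# Positions of the hypotheses that are not rejected at the chosen level
not_rejected <- which(p_values > level)

# Position i corresponds to k = i - 1; if every hypothesis is rejected,
# all components are treated as signal
k_hat <- if (length(not_rejected) > 0) not_rejected[1] - 1 else length(p_values)
k_hat
```

The same rule could be applied to a p-value vector such as res$p.value, assuming its entries are ordered by k = 0, ..., p - 1.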