NGPPest | R Documentation |
Estimates the dimension of the signal subspace using NGPP to conduct sequential hypothesis testing. The test statistic is a multivariate extension of the classical Jarque-Bera statistic, and its distribution under the null hypothesis is obtained by simulation.
NGPPest(X, nl = c("skew", "pow3"), alpha = 0.8, N = 500, eps = 1e-6,
verbose = FALSE, maxiter = 100)
X |
Numeric matrix with n rows corresponding to the observations and p columns corresponding to the variables. |
nl |
Vector of non-linearities, a convex combination of the corresponding squared objective functions of which is then used as the projection index. The choices include |
alpha |
Vector of positive weights between 0 and 1 given to the non-linearities. The length of |
N |
Number of normal samples to be used in simulating the distribution of the test statistic under the null hypothesis. |
eps |
Convergence tolerance. |
verbose |
If |
maxiter |
Maximum number of iterations. |
It is assumed that the data is a random sample from the model x = m + A s, where the latent vector s = (s_1^T, s_2^T)^T consists of a k-dimensional non-Gaussian subvector s_1 (the signal) and a (p - k)-dimensional Gaussian subvector s_2 (the noise), and the components of s are mutually independent. Without loss of generality we further assume that the components of s have zero means and unit variances.
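The model above can be illustrated by simulating from it. The following is a minimal sketch in base R; the sample size, the choice of signal distributions, the mixing matrix `A` and the location vector `m` are all hypothetical values chosen for illustration and are not part of the package.

```r
# Sketch: draw n observations from x = m + A s with k non-Gaussian signals.
set.seed(1)
n <- 500; p <- 4; k <- 2

# Signal part: k independent non-Gaussian components, zero mean, unit variance.
s_signal <- cbind(
  rexp(n) - 1,                  # centred exponential(1): mean 0, variance 1
  runif(n, -sqrt(3), sqrt(3))   # uniform on (-sqrt(3), sqrt(3)): mean 0, variance 1
)
# Noise part: p - k independent standard Gaussian components.
s_noise <- matrix(rnorm(n * (p - k)), nrow = n)

S <- cbind(s_signal, s_noise)      # n x p matrix, one latent vector per row
A <- matrix(rnorm(p * p), p, p)    # hypothetical mixing matrix
m <- c(1, -2, 0.5, 3)              # hypothetical location vector

# Row i of X equals t(m + A %*% s_i).
X <- sweep(S %*% t(A), 2, m, "+")
```

A matrix built this way has the layout expected for the `X` argument: observations in rows, variables in columns.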
The algorithm first estimates all p components from the data using deflation-based NGPP with the chosen non-linearities and weighting, and then tests the null hypothesis H_0: k_{true} \leq k for each k = 0, \ldots, p - 1. The testing is based on the fact that under the null hypothesis H_0: k_{true} \leq k the distribution of the final p - k components is standard multivariate normal. The significance of the test can then be obtained by comparing the objective function value of the (k + 1)th estimated component to the same quantity estimated from N samples of size n from a (p - k)-dimensional standard multivariate normal distribution.
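The simulation-based test described above can be sketched in base R for the simplest case p - k = 1, where the remaining component is one-dimensional and no direction search is needed. The sketch assumes that the "skew" non-linearity corresponds to the third moment and "pow3" to the excess kurtosis of a standardized component; `ngpp_index` and `jarque_bera` are hypothetical helper names, and the standardization and p-value formula are illustrative choices rather than the package's internals. Under these assumptions the weighting 0.8 / 0.2 reproduces the classical Jarque-Bera statistic up to the factor n / 4.8.

```r
# Hypothetical projection index: convex combination of squared skewness
# and squared excess kurtosis of a standardized component u.
ngpp_index <- function(u, alpha = 0.8) {
  z <- u - mean(u)
  z <- z / sqrt(mean(z^2))   # standardize using the 1/n variance
  alpha * mean(z^3)^2 + (1 - alpha) * (mean(z^4) - 3)^2
}

# Classical univariate Jarque-Bera statistic, for comparison.
jarque_bera <- function(u) {
  n <- length(u)
  z <- u - mean(u)
  skew <- mean(z^3) / mean(z^2)^1.5
  kurt <- mean(z^4) / mean(z^2)^2
  n * (skew^2 / 6 + (kurt - 3)^2 / 24)
}

set.seed(1)
n <- 150
u_obs <- rexp(n)   # a clearly non-Gaussian "last component"

# With alpha = 0.8 the index is proportional to Jarque-Bera.
jb_matches <- isTRUE(all.equal(jarque_bera(u_obs),
                               n / 4.8 * ngpp_index(u_obs)))

# Null distribution: the same index computed on N standard normal samples
# of size n (valid as written only for p - k = 1).
N <- 200
null_stats <- replicate(N, ngpp_index(rnorm(n)))

# Monte Carlo p-value: share of simulated values at least as large as observed.
p_value <- mean(null_stats >= ngpp_index(u_obs))
```

For p - k > 1 each simulated sample would itself need a projection pursuit step to find the maximizing direction, which this sketch deliberately omits.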
Note that if maxiter is reached at any step of the algorithm, it will use the current estimated direction and continue to the next step.
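One common way to turn the sequence of tests into a single dimension estimate is to take the smallest k whose null hypothesis is not rejected. The sketch below assumes the p-values are ordered by k = 0, ..., p - 1; the helper name `estimate_k`, the significance level and the example p-values are hypothetical and made up purely for illustration.

```r
# Hypothetical helper: smallest k for which H_0: k_true <= k is not rejected.
# p_values[j] is assumed to belong to the test with k = j - 1.
estimate_k <- function(p_values, level = 0.05) {
  not_rejected <- which(p_values >= level)
  if (length(not_rejected) == 0) {
    length(p_values)        # every test rejected: all p components are signal
  } else {
    not_rejected[1] - 1     # convert the 1-based position to k
  }
}

# Made-up p-values for p = 4 (not output of any real analysis).
p_demo <- c(0.001, 0.004, 0.31, 0.62)
k_hat <- estimate_k(p_demo)   # 2: the tests for k = 0 and k = 1 are rejected
```

Because several tests are carried out in sequence, the chosen level is not an overall error rate for the estimate.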
A list with class 'icest' containing the following components:
statistic |
Test statistic, i.e. the objective function values of all estimated components. |
p.value |
Obtained vector of |
parameter |
Number |
method |
Character string |
data.name |
Character string giving the name of the data. |
W |
Estimated unmixing matrix |
S |
Matrix of size |
D |
Vector of the objective function values of the signals |
MU |
Location vector of the data which was subtracted before estimating the signal components. |
conv |
Boolean vector telling for which components the algorithm converged ( |
Joni Virta
Virta, J., Nordhausen, K. and Oja, H. (2016), Projection Pursuit for non-Gaussian Independent Components, <https://arxiv.org/abs/1612.05445>.
NGPP, NGPPsim
# Iris data
X <- as.matrix(iris[, 1:4])
# The number of simulations N should be increased in practical situations
# Now we settle for N = 100
res <- NGPPest(X, N = 100)
res$statistic
res$p.value
res$conv