NGPPsim | R Documentation |
Tests whether the true dimension of the signal subspace is less than or equal to a given k. The test statistic is a multivariate extension of the classical Jarque-Bera statistic and its distribution under the null hypothesis is obtained by simulation.
NGPPsim(X, k, nl = c("skew", "pow3"), alpha = 0.8, N = 1000, eps = 1e-6,
verbose = FALSE, maxiter = 100)
X | Numeric matrix with n rows corresponding to the observations and p columns corresponding to the variables. |
k | Number of components to estimate, |
nl | Vector of non-linearities, a convex combination of the corresponding squared objective functions of which is then used as the projection index. The choices include |
alpha | Vector of positive weights between 0 and 1 given to the non-linearities. The length of |
N | Number of normal samples to be used in simulating the distribution of the test statistic under the null hypothesis. |
eps | Convergence tolerance. |
verbose | If |
maxiter | Maximum number of iterations. |
It is assumed that the data is a random sample from the model x = m + A s, where the latent vector s = (s_1^T, s_2^T)^T consists of a k-dimensional non-Gaussian subvector (the signal) and a (p - k)-dimensional Gaussian subvector (the noise), and the components of s are mutually independent. Without loss of generality we further assume that the components of s have zero means and unit variances.
To test the null hypothesis H_0: k_{true} \leq k, the algorithm first estimates k + 1 components using deflation-based NGPP with the chosen non-linearities and weighting. Under the null hypothesis the distribution of the final p - k components is standard multivariate normal, and the significance of the test is obtained by comparing the objective function value of the (k + 1)th estimated component to the same quantity estimated from N samples of size n from the (p - k)-dimensional standard multivariate normal distribution.
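The simulation step described above can be sketched in base R. This is a simplified illustration, not the package's implementation: it skips the projection pursuit search entirely and evaluates a skewness/excess-kurtosis index on a single univariate sample. The helpers `proj_index` and `sim_p_value` are hypothetical names, and both the form of the index (weight `alpha` on squared skewness, `1 - alpha` on squared excess kurtosis) and the `(count + 1) / (N + 1)` p-value convention are assumptions made for this sketch.

```r
# Hypothetical helper: weighted index of squared skewness and squared
# excess kurtosis for a univariate sample (assumed form, for illustration).
proj_index <- function(u, alpha = 0.8) {
  u <- u - mean(u)
  u <- u / sqrt(mean(u^2))          # standardize to zero mean, unit variance
  skew <- mean(u^3)                 # sample skewness
  exkurt <- mean(u^4) - 3           # sample excess kurtosis
  alpha * skew^2 + (1 - alpha) * exkurt^2
}

# Hypothetical helper: Monte Carlo p-value obtained by comparing the observed
# index to the same index computed from N standard normal samples of size n.
sim_p_value <- function(observed, n, N = 1000, alpha = 0.8) {
  null_values <- replicate(N, proj_index(rnorm(n), alpha))
  # Assumed convention: add one to numerator and denominator so the
  # p-value is never exactly zero.
  (sum(null_values >= observed) + 1) / (N + 1)
}

set.seed(1)
n <- 200
p_signal <- sim_p_value(proj_index(rexp(n)), n, N = 200)   # non-Gaussian input
p_noise  <- sim_p_value(proj_index(rnorm(n)), n, N = 200)  # Gaussian input
```

A small `p_signal` indicates that the exponential sample looks non-Gaussian relative to the simulated null, whereas `p_noise` is just one draw from the null distribution. In the actual test the compared quantity is the objective function value of the (k + 1)th estimated component rather than a fixed univariate sample.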
Note that if maxiter is reached at any step of the algorithm it will use the current estimated direction and continue to the next step.
A list with class 'ictest', inheriting from the class 'htest', containing the following components:
statistic | Test statistic, i.e. the objective function value of the ( |
p.value | Obtained |
parameter | Number |
method | Character string denoting which test was performed. |
data.name | Character string giving the name of the data. |
alternative | Alternative hypothesis, i.e. |
k | Tested dimension |
W | Estimated unmixing matrix |
S | Matrix of size |
D | Vector of the objective function values of the signals |
MU | Location vector of the data which was subtracted before estimating the signal components. |
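As a brief sketch of how the components listed above might be inspected, the following assumes the function is loaded from the ICtest package and runs only when that package is installed. The simulated data mirror the setup used in the examples; the seed and the object name `res` are choices made for this illustration.

```r
# Runs only if the ICtest package is available.
if (requireNamespace("ICtest", quietly = TRUE)) {
  set.seed(1)
  n <- 200
  S <- cbind(rexp(n), rbeta(n, 1, 2), rnorm(n), rnorm(n))
  A <- matrix(rnorm(16), ncol = 4)
  X <- S %*% t(A)

  # Small N keeps the run short; increase it in practice.
  res <- ICtest::NGPPsim(X, 1, N = 100)

  print(res$statistic)  # test statistic
  print(res$p.value)    # simulated p-value
  print(res$k)          # tested dimension
  print(res$W)          # estimated unmixing matrix
  print(res$D)          # objective function values of the signals
  print(res$MU)         # location vector subtracted from the data
}
```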
Joni Virta
Virta, J., Nordhausen, K. and Oja, H., (2016), Projection Pursuit for non-Gaussian Independent Components, <https://arxiv.org/abs/1612.05445>.
NGPP, NGPPest
# Simulated data with 2 signals and 2 noise components
n <- 200
S <- cbind(rexp(n), rbeta(n, 1, 2), rnorm(n), rnorm(n))
A <- matrix(rnorm(16), ncol = 4)
X <- S %*% t(A)
# The number of simulations N should be increased in practical situations
# Now we settle for N = 100
res1 <- NGPPsim(X, 1, N = 100)
res1
screeplot(res1)
res2 <- NGPPsim(X, 2, N = 100)
res2
screeplot(res2)