NGPPsim | R Documentation |
Tests whether the true dimension of the signal subspace is less than or equal to a given k. The test statistic is a multivariate extension of the classical Jarque-Bera statistic, and its distribution under the null hypothesis is obtained by simulation.
NGPPsim(X, k, nl = c("skew", "pow3"), alpha = 0.8, N = 1000, eps = 1e-6, verbose = FALSE, maxiter = 100)
X |
Numeric matrix with n rows corresponding to the observations and p columns corresponding to the variables. |
k |
Number of components to estimate, 1 <= k <= p. |
nl |
Vector of non-linearities; a convex combination of the corresponding squared objective functions is then used as the projection index. The choices include |
alpha |
Vector of positive weights between 0 and 1 given to the non-linearities. The length of |
N |
Number of normal samples to be used in simulating the distribution of the test statistic under the null hypothesis. |
eps |
Convergence tolerance. |
verbose |
If |
maxiter |
Maximum number of iterations. |
It is assumed that the data are a random sample from the model x = m + A s, where the latent vector s = (s_1', s_2')' consists of a k-dimensional non-Gaussian subvector (the signal) and a (p - k)-dimensional Gaussian subvector (the noise), and the components of s are mutually independent. Without loss of generality, we further assume that the components of s have zero means and unit variances.
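The model above can be illustrated with a small simulation. This is only a sketch: the sample size, the dimensions, the choice of signal distributions, the location vector `m` and the mixing matrix `A` are all assumptions made for the illustration and are not taken from the package.

```r
# Sketch of the assumed model x = m + A s (all values below are illustrative).
set.seed(1)
n <- 500  # sample size
p <- 4    # number of observed variables
k <- 2    # dimension of the non-Gaussian signal

# Signal: k independent non-Gaussian components with mean 0 and variance 1
s1 <- cbind(rexp(n) - 1,                  # Exp(1) has mean 1 and variance 1
            runif(n, -sqrt(3), sqrt(3)))  # this uniform has mean 0 and variance 1

# Noise: p - k independent standard Gaussian components
s2 <- matrix(rnorm(n * (p - k)), nrow = n)

S <- cbind(s1, s2)                   # n x p matrix of latent components
m <- c(1, -2, 0, 3)                  # hypothetical location vector
A <- matrix(rnorm(p * p), ncol = p)  # hypothetical mixing matrix

# Row i of X equals m + A s_i
X <- sweep(S %*% t(A), 2, m, "+")
dim(X)
```

A matrix built this way has the layout expected for the argument `X`: observations in rows and variables in columns.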
To test the null hypothesis H0: k_true <= k, the algorithm first estimates k + 1 components using deflation-based NGPP with the chosen non-linearities and weighting. Under the null hypothesis the distribution of the final p - k components is standard multivariate normal, and the significance of the test is obtained by comparing the objective function value of the (k + 1)th estimated component to the same quantity estimated from N samples of size n from the (p - k)-dimensional standard multivariate normal distribution.
Note that if maxiter is reached at any step of the algorithm, the current estimated direction is used and the algorithm continues to the next step.
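The simulation idea behind the test can be sketched in base R. This is not the package's implementation: the helper names `objective` and `stat_fun` are hypothetical, the objective assumes that the squared objective functions for "skew" and "pow3" are the squared third moment and the squared excess kurtosis of a standardized projection, and the NGPP optimization is replaced by a crude stand-in that takes the largest objective value over the coordinate axes. The p-value formula with the +1 correction is one common Monte Carlo convention and may differ from what NGPPsim uses.

```r
set.seed(1)

# Assumed weighted objective for one projection u:
# alpha * (third moment)^2 + (1 - alpha) * (excess kurtosis)^2
objective <- function(u, alpha = 0.8) {
  u <- (u - mean(u)) / sd(u)  # standardize the projection
  alpha * mean(u^3)^2 + (1 - alpha) * (mean(u^4) - 3)^2
}

# Stand-in for the objective value of the first direction found by NGPP:
# the largest objective over the coordinate axes (no optimization is done).
stat_fun <- function(Z, alpha = 0.8) {
  max(apply(Z, 2, objective, alpha = alpha))
}

n <- 200  # sample size
q <- 2    # plays the role of p - k
N <- 200  # number of simulated normal samples

# "Observed" data: deliberately non-Gaussian in the first coordinate
Z_obs <- cbind(rexp(n), rnorm(n))
t_obs <- stat_fun(Z_obs)

# Null distribution: the same statistic computed on N samples of size n
# from the q-dimensional standard normal distribution
t_null <- replicate(N, stat_fun(matrix(rnorm(n * q), nrow = n)))

# Monte Carlo p-value: share of simulated statistics at least as large
p_value <- (sum(t_null >= t_obs) + 1) / (N + 1)
p_value
```

Because the observed data contain an exponential component, the observed statistic should sit far in the upper tail of the simulated values, which yields a small p-value. A larger N gives a finer resolution for the p-value at a higher computational cost.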
A list with class 'ictest', inheriting from the class 'htest', containing the following components:
statistic |
Test statistic, i.e. the objective function value of the ( |
p.value |
Obtained p-value. |
parameter |
Number |
method |
Character string denoting which test was performed. |
data.name |
Character string giving the name of the data. |
alternative |
Alternative hypothesis, i.e. |
k |
Tested dimension |
W |
Estimated unmixing matrix |
S |
Matrix of size n x (k + 1) containing the estimated signals. |
D |
Vector of the objective function values of the signals |
MU |
Location vector of the data that was subtracted before estimating the signal components. |
Joni Virta
Virta, J., Nordhausen, K. and Oja, H., (2016), Projection Pursuit for non-Gaussian Independent Components, <https://arxiv.org/abs/1612.05445>.
NGPP, NGPPest
# Simulated data with 2 signals and 2 noise components
n <- 200
S <- cbind(rexp(n), rbeta(n, 1, 2), rnorm(n), rnorm(n))
A <- matrix(rnorm(16), ncol = 4)
X <- S %*% t(A)

# The number of simulations N should be increased in practical situations
# Now we settle for N = 100
res1 <- NGPPsim(X, 1, N = 100)
res1
screeplot(res1)

res2 <- NGPPsim(X, 2, N = 100)
res2
screeplot(res2)