View source: R/comp.simu.test.R
comp_simu_test | R Documentation |
Identifies invariant coordinates that are nonnormal using simulations under a standard multivariate normal model for a specific data setup and scatter combination.
comp_simu_test(
  object,
  S1 = NULL,
  S2 = NULL,
  S1_args = list(),
  S2_args = list(),
  m = 10000,
  type = "smallprop",
  level = 0.05,
  adjust = TRUE,
  n_cores = NULL,
  iseed = NULL,
  pkg = "ICSOutlier",
  q_type = 7,
  ...
)
object |
object of class |
S1 |
an object of class |
S2 |
an object of class |
S1_args |
a list containing additional arguments for |
S2_args |
a list containing additional arguments for |
m |
number of simulations. Note that since extreme quantiles are of interest |
type |
currently the only type option is |
level |
the initial level used to make a decision. The cut-off values are the (1- |
adjust |
logical. If |
n_cores |
number of cores to be used. If |
iseed |
If parallel computation is used the seed passed on to |
pkg |
When using parallel computing, a character vector listing all the packages which need to be loaded on the different cores via |
q_type |
specifies the quantile algorithm used in |
... |
further arguments passed on to the function |
Based on simulations, it detects which of the components follow a univariate normal distribution. More precisely, it identifies the observed eigenvalues that are larger than those obtained from normally distributed data. m standard normal data sets are simulated using the same data size and scatters as specified in the "ICS" object.
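The simulation step can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the helper names cov4_sketch() and ics_eigenvalues() are hypothetical, the scatter pair is fixed to the covariance matrix and a fourth-moment scatter, and m is kept very small.

```r
# Hypothetical helper: fourth-moment scatter matrix, scaled so that it
# estimates the covariance matrix under normality.
cov4_sketch <- function(X) {
  n <- nrow(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))             # centre the columns
  d2 <- mahalanobis(X, colMeans(X), cov(X))  # squared Mahalanobis distances
  crossprod(Xc * sqrt(d2)) / (n * (p + 2))   # weighted sum of outer products
}

# Hypothetical helper: eigenvalues of S1^{-1} S2, computed through a
# symmetric form so that eigen() returns real values in decreasing order.
ics_eigenvalues <- function(X) {
  S1 <- cov(X)
  S2 <- cov4_sketch(X)
  R_inv <- solve(chol(S1))                   # S1 = R'R, so S1^{-1} = R_inv R_inv'
  eigen(t(R_inv) %*% S2 %*% R_inv,
        symmetric = TRUE, only.values = TRUE)$values
}

set.seed(1)
n <- 200  # same number of observations as the (assumed) original data
p <- 4    # same number of variables
m <- 50   # number of simulated data sets (tiny, for illustration only)

# One row per simulated standard normal data set, one column per component
sim_eigen <- t(replicate(m, ics_eigenvalues(matrix(rnorm(n * p), n, p))))
dim(sim_eigen)
```

Each column of sim_eigen then approximates the distribution of the corresponding eigenvalue under normality, which is what the cut-off values are computed from.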
The cut-off values are determined based on a quantile of these simulated eigenvalues. As the eigenvalues of ICS, also known as generalized kurtosis values, are ordered, it is natural to perform the comparison in a specific order depending on the purpose. Currently the only available type is "smallprop", so starting with the first component, the observed eigenvalues are successively compared to these cut-off values. The procedure stops when an eigenvalue is below the corresponding cut-off, that is, when a normal component is detected.
If adjust = FALSE, all eigenvalues are compared to cut-offs at the same quantile level, 1 - level. However, this often leads to too many selected components. Therefore some multiple testing adjustment might be useful. The current default adjusts the quantile level for the jth component to 1 - level/j.
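The cut-off and stopping rule described above can be sketched in base R. This is a minimal illustration, not the function's actual implementation: sim_eigen and obs_eigen are made-up toy values standing in for the simulated and observed eigenvalues, and treating an eigenvalue exactly equal to its cut-off as "below" is an assumption.

```r
set.seed(42)
m <- 2000
p <- 5

# Toy stand-in for the simulated eigenvalues: m rows, each sorted in
# decreasing order (real values would come from ICS on simulated normal data).
sim_eigen <- t(apply(matrix(rnorm(m * p, mean = 1, sd = 0.1), m, p),
                     1, sort, decreasing = TRUE))

# Hypothetical observed eigenvalues: only the first one is clearly too large.
obs_eigen <- c(2.5, 1.05, 1.00, 0.97, 0.90)

level <- 0.05
adjust <- TRUE

# Level used for the j-th component: level / j if adjusted, level otherwise
levels_j <- if (adjust) level / seq_len(p) else rep(level, p)

# Cut-off for component j: the (1 - levels_j[j]) quantile of column j
cutoffs <- vapply(seq_len(p), function(j) {
  quantile(sim_eigen[, j], probs = 1 - levels_j[j], type = 7, names = FALSE)
}, numeric(1))

# Sequential comparison: stop at the first eigenvalue not above its cut-off
exceeds <- obs_eigen > cutoffs
n_selected <- if (all(exceeds)) p else which(!exceeds)[1] - 1
index <- seq_len(n_selected)
index
```

With these toy values only the first component should be selected. Setting adjust <- FALSE uses the same level for every component, which gives lower cut-offs for the later components.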
Note that, depending on the data size and scatters used, this can take a while, so it is more efficient to parallelize computations. Note also that the function is seldom called directly by the user but internally by ICS_outlier().
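As a rough illustration of why parallelization helps, independent simulation runs can be spread over several workers with the parallel package. The sketch below is an assumption about how such a setup might look, not a description of what the function does internally: one_run() and run_sims() are hypothetical names, and the statistic computed in each run is a simple placeholder.

```r
library(parallel)

# Hypothetical stand-in for one simulation run: the largest eigenvalue of the
# sample covariance matrix of a standard normal data set.
one_run <- function(i, n, p) {
  X <- matrix(rnorm(n * p), n, p)
  max(eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values)
}

# Hypothetical wrapper: run m simulations on n_cores workers with a fixed seed
run_sims <- function(m, n, p, n_cores = 2, iseed = 123) {
  cl <- makeCluster(n_cores)
  on.exit(stopCluster(cl))        # always release the workers
  clusterSetRNGStream(cl, iseed)  # reproducible random streams on the workers
  parSapply(cl, seq_len(m), one_run, n = n, p = p)
}

res1 <- run_sims(m = 20, n = 100, p = 3)
res2 <- run_sims(m = 20, n = 100, p = 3)
identical(res1, res2)
```

Seeding the workers explicitly matters because set.seed() in the main session does not control the random numbers drawn on the worker processes.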
A list containing:
index: integer vector indicating the indices of the selected components.
test: string "simulation".
criterion: vector of the cut-off values for all the eigenvalues.
levels: vector of the levels used for the decision for each component.
adjust: logical. TRUE if adjusted.
type: type used.
m: number of iterations m used in the simulations.
Aurore Archimbaud and Klaus Nordhausen
Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (2018), ICS for multivariate outlier detection with application to quality control. Computational Statistics & Data Analysis, 128:184-199. ISSN 0167-9473. doi:10.1016/j.csda.2018.06.011.
ICS(), comp_norm_test()
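# For a real analysis use larger values for m and more cores if available
library(ICS)         # ICS(), ICS_cov, ICS_cov4
library(ICSOutlier)  # comp_simu_test()
library(mvtnorm)     # rmvnorm()
set.seed(123)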
Z <- rmvnorm(1000, rep(0, 6))
# Add 20 outliers on the first component
Z[1:20, 1] <- Z[1:20, 1] + 10
pairs(Z)
icsZ <- ICS(Z)
# For demo purpose only small m value, should select the first component
comp_simu_test(icsZ, S1 = ICS_cov, S2 = ICS_cov4, m = 400, n_cores = 1)
## Not run:
# For using two cores
# For demo purpose only small m value, should select the first component
comp_simu_test(icsZ, S1 = ICS_cov, S2 = ICS_cov4, m = 500, n_cores = 2, iseed = 123)
# For using several cores and for using a scatter function from a different package
# Using the parallel package to detect automatically the number of cores
library(parallel)
# ICS with MCD estimates and the usual estimates
library(ICSClust)
icsZmcd <- ICS(Z, S1 = ICS_mcd_raw, S2 = ICS_cov, S1_args = list(alpha = 0.75))
# For demo purpose only small m value, should select the first component
comp_simu_test(icsZmcd, S1 = ICS_mcd_raw, S2 = ICS_cov,
               S1_args = list(alpha = 0.75, location = TRUE),
               m = 500, n_cores = detectCores() - 1,
               pkg = c("ICSOutlier", "ICSClust"), iseed = 123)
## End(Not run)
# Example with no outlier
Z0 <- rmvnorm(1000, rep(0, 6))
pairs(Z0)
icsZ0 <- ICS(Z0)
# Should select no component
comp_simu_test(icsZ0, S1 = ICS_cov, S2 = ICS_cov4, m = 400, level = 0.01, n_cores = 1)