comp.simu.test | R Documentation |
Identifies invariant coordinates that are nonnormal using simulations under a standard multivariate normal model for a specific data setup and scatter combination.
comp.simu.test(object, m = 10000, type = "smallprop", level = 0.05,
adjust = TRUE, ncores = NULL, iseed = NULL, pkg = "ICSOutlier",
qtype = 7, ...)
object |
object of class |
m |
number of simulations. Note that since extreme quantiles are of interest |
type |
currently the only type option is |
level |
the initial level used to make a decision. The cut-off values are the (1- |
adjust |
logical. If |
ncores |
number of cores to be used. If |
iseed |
If parallel computation is used the seed passed on to |
pkg |
When using parallel computing, a character vector listing all the packages which need to be loaded on the different cores via |
qtype |
specifies the quantile algorithm used in |
... |
further arguments passed on to the function |
Based on simulations, it detects which of the components follow a univariate normal distribution. More precisely, it identifies the observed eigenvalues that are larger than those obtained from normally distributed data. m standard normal data sets are simulated using the same data size and scatters as specified in the ics2 object.
The cut-off values are determined based on a quantile of these simulated eigenvalues.
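The simulation step can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: it assumes the scatter pair is the covariance matrix and a fourth-moment scatter, and the helpers cov4_sketch() and gen_kurtosis() are hypothetical names written for this example. A single common level is used for all components.

```r
set.seed(1)
n <- 200      # number of observations (same size as the observed data)
p <- 4        # number of variables
m <- 50       # number of simulated data sets (far too small for real use)
level <- 0.05

# Hypothetical helper: scatter matrix based on fourth moments,
# scaled so that it estimates the covariance under normality.
cov4_sketch <- function(X) {
  Xc <- sweep(X, 2, colMeans(X))
  d2 <- mahalanobis(X, colMeans(X), cov(X))  # squared Mahalanobis distances
  crossprod(Xc * sqrt(d2)) / (nrow(X) * (ncol(X) + 2))
}

# Hypothetical helper: eigenvalues of S1^{-1} S2 in decreasing order.
gen_kurtosis <- function(X) {
  ev <- eigen(solve(cov(X)) %*% cov4_sketch(X), only.values = TRUE)$values
  sort(Re(ev), decreasing = TRUE)
}

# One row of eigenvalues per simulated standard normal data set.
sim_ev <- t(replicate(m, gen_kurtosis(matrix(rnorm(n * p), n, p))))

# Cut-off for component j: the (1 - level) quantile of the j-th simulated eigenvalue.
cutoffs <- unname(apply(sim_ev, 2, quantile, probs = 1 - level, type = 7))
cutoffs
```

The type = 7 argument of quantile() plays the role that qtype plays in the function's interface.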
As the eigenvalues, aka generalized kurtosis values, of ICS are ordered it is natural to perform the comparison in a specific order depending on the purpose.
Currently the only available type is "smallprop", so starting with the first component, the observed eigenvalues are successively compared to these cut-off values. The procedure stops when an eigenvalue is below the corresponding cut-off, that is, when a normal component is detected.
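The stopping rule described above can be sketched as follows. The eigenvalues and cut-offs are made-up numbers, and select_smallprop() is a hypothetical helper for illustration, not a function of the package. Returning an empty vector when nothing is selected is a choice of this sketch.

```r
# Made-up observed eigenvalues (in decreasing ICS order) and cut-offs.
observed <- c(3.2, 1.6, 1.05, 1.3, 0.9)
cutoffs  <- c(1.5, 1.4, 1.30, 1.2, 1.1)

# Hypothetical helper: keep the leading components until the first
# eigenvalue that falls below its cut-off.
select_smallprop <- function(observed, cutoffs) {
  below <- observed < cutoffs
  k <- if (any(below)) which(below)[1] - 1 else length(observed)
  seq_len(k)
}

selected <- select_smallprop(observed, cutoffs)
selected
```

In this example the fourth eigenvalue exceeds its cut-off but is not selected, because the procedure has already stopped at the third component.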
If adjust = FALSE, all eigenvalues are compared to the same (1 - level)th quantile. However, this often leads to too many selected components. Therefore some multiple testing adjustment might be useful. The current default adjusts the quantile for the jth component to 1 - level/j.
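As a small worked example of the adjustment, the levels and quantile probabilities for six components with level = 0.05 can be computed as below. The variable names are chosen for this illustration only.

```r
level <- 0.05
p <- 6  # number of components

# adjust = FALSE: the same level for every component.
levels_unadjusted <- rep(level, p)

# adjust = TRUE: level / j for the j-th component.
levels_adjusted <- level / seq_len(p)

# Quantile probabilities used for the cut-offs in each case.
probs_unadjusted <- 1 - levels_unadjusted
probs_adjusted   <- 1 - levels_adjusted

round(probs_adjusted, 4)
```

The adjusted probabilities increase with j, so later components must clear a more extreme quantile to be selected.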
Note that, depending on the data size and the scatters used, the simulations can take a while, so it is usually more efficient to parallelize the computations.
Note also that the function is seldom called directly by the user; it is usually called internally by ics.outlier.
A list containing:
index |
integer vector indicating the indices of the selected components. |
test |
string |
criterion |
vector of the cut-off values for all the eigenvalues. |
levels |
vector of the levels used to derive the cut-offs for each component. |
adjust |
logical. |
type |
|
m |
number of iterations |
Function comp.simu.test has reached the end of its lifecycle; please use comp_simu_test() instead. In future versions, comp.simu.test will be deprecated and eventually removed.
Aurore Archimbaud and Klaus Nordhausen
Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (2018), ICS for multivariate outlier detection with application to quality control. Computational Statistics & Data Analysis, 128:184-199. ISSN 0167-9473. <https://doi.org/10.1016/j.csda.2018.06.011>.
ics2, comp.norm.test
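# For a real analysis use larger values for m and more cores if available
library(ICSOutlier)  # comp.simu.test(); assumed to also attach ICS, which provides ics2()
library(mvtnorm)     # rmvnorm()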
set.seed(123)
Z <- rmvnorm(1000, rep(0, 6))
# Add 20 outliers on the first component
Z[1:20, 1] <- Z[1:20, 1] + 10
pairs(Z)
icsZ <- ics2(Z)
# For demo purpose only small m value, should select the first component
comp.simu.test(icsZ, m = 400, ncores = 1)
## Not run:
# For using two cores
# For demo purpose only small m value, should select the first component
comp.simu.test(icsZ, m = 500, ncores = 2, iseed = 123)
# For using several cores and for using a scatter function from a different package
# Using the parallel package to detect automatically the number of cores
library(parallel)
# ICS with MCD estimates and the usual estimates
# Need to create a wrapper for the CovMcd function to return first the location estimate
# and the scatter estimate secondly.
library(rrcov)
myMCD <- function(x, ...) {
  mcd <- CovMcd(x, ...)
  return(list(location = mcd@center, scatter = mcd@cov))
}
icsZmcd <- ics2(Z, S1 = myMCD, S2 = MeanCov, S1args = list(alpha = 0.75))
# For demo purpose only small m value, should select the first component
comp.simu.test(icsZmcd, m = 500, ncores = detectCores() - 1,
               pkg = c("ICSOutlier", "rrcov"), iseed = 123)
## End(Not run)
# Example with no outlier
Z0 <- rmvnorm(1000, rep(0, 6))
pairs(Z0)
icsZ0 <- ics2(Z0)
# Should select no component
comp.simu.test(icsZ0, m = 400, level = 0.01, ncores = 1)