table13sim.parallel.check: Replicate part of the experiment presented in Cerioli et al....


Description

Replicates a small part of the experiment presented in Cerioli et al. (2009), Tables 1 and 3, for the MCD using the maximum breakdown point fraction and a fraction of exactly 0.5.

Usage

table13sim.parallel.check(cl, p, nn, N, B = 250, alpha = c(0.01, 0.025, 0.05), lgf = "", mlgf = "")

Arguments

cl

A cluster object, e.g., as returned by makePSOCKcluster. The user must create this object before calling table13sim.parallel.check.

p

The dimension of the data used in each simulated run.

nn

The number of observations used in each simulated run.

N

The number of simulations to run.

B

The batch/block size: the number of simulations to run in each block. This is useful when running very large simulation runs (N very large) where memory is a concern. Blocks are distributed across cluster nodes, so this is also a means of controlling the workload on each node.

alpha

The significance level to use for detecting outliers. Can be a vector; the outlier detection tests will be run at each level.

lgf

Path to log file into which logging information should be written.

mlgf

Not used at this time.
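
To make the role of B concrete, the sketch below shows one way N simulation runs could be split into blocks of at most B runs each. It illustrates the idea only: block_sizes is a hypothetical helper, and the partitioning actually used inside table13sim.parallel.check is not documented here and may differ.

```r
# Hypothetical helper (not part of the package): split N simulation runs
# into blocks of at most B runs each.
block_sizes <- function(N, B = 250) {
  stopifnot(N >= 1, B >= 1)
  n_full <- N %/% B   # number of full blocks
  rest   <- N %% B    # size of the final partial block, if any
  c(rep(B, n_full), if (rest > 0) rest)
}

sizes <- block_sizes(N = 1100, B = 250)
sizes          # 250 250 250 250 100
length(sizes)  # 5 blocks that could be handed out to the cluster nodes
```

A smaller B gives more, lighter blocks, at the cost of more scheduling overhead per block.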

Details

This is a variant of table13sim.parallel designed to investigate differences between the outlier detection tests with the MCD when the data fraction is (a) the maximum breakdown point and (b) exactly 0.5. It also checks whether the small-sample correction was used in the results of Hardin and Rocke (2005) and Cerioli et al. (2009).

This function is not really useful to anyone other than the author, and is not supported. Do not use it.

Value

A three-dimensional array:

  1. The results of each of the N simulation runs appear along the first dimension.

  2. The various estimators and tests appear along the second dimension. Results with suffix “T1” correspond to Table 1 of Cerioli et al. (2009) (the individual outlier tests) while those with suffix “T3” correspond to Table 3 (the simultaneous outlier tests). Currently the 26 columns appear in the following order.

    Column Name            Covariance Estimate                                             Test Statistic
    "MCDMBP.RAW.T1"        MCD (max. breakdown pt.)                                        chi-squared
    "MCDMBP.RAWGM.T1"      MCD (max. breakdown pt.)                                        Green-Martin
    "MCDMBP.RAWHR.T1"      MCD (max. breakdown pt.)                                        Hardin-Rocke
    "MCDMBP.RAWNOSSGM.T1"  MCD (max. breakdown pt.), no small-sample correction            Green-Martin
    "MCDMBP.RAWNOSSHR.T1"  MCD (max. breakdown pt.), no small-sample correction            Hardin-Rocke
    "RMCDMBP.T1"           reweighted MCD (max. breakdown pt.)                             chi-squared
    "MCDMBP.RAW.T3"        MCD (max. breakdown pt.)                                        chi-squared
    "MCDMBP.RAWGM.T3"      MCD (max. breakdown pt.)                                        Green-Martin
    "MCDMBP.RAWHR.T3"      MCD (max. breakdown pt.)                                        Hardin-Rocke
    "MCDMBP.RAWNOSSGM.T3"  MCD (max. breakdown pt.), no small-sample correction            Green-Martin
    "MCDMBP.RAWNOSSHR.T3"  MCD (max. breakdown pt.), no small-sample correction            Hardin-Rocke
    "RMCDMBP.T3"           reweighted MCD (max. breakdown pt.)                             chi-squared
    "RMCDMBP.CH.T3"        reweighted MCD (max. breakdown pt.) with Bonferroni correction  chi-squared
    "MCD50.RAW.T1"         MCD(0.50)                                                       chi-squared
    "MCD50.RAWGM.T1"       MCD(0.50)                                                       Green-Martin
    "MCD50.RAWHR.T1"       MCD(0.50)                                                       Hardin-Rocke
    "MCD50.RAWNOSSGM.T1"   MCD(0.50), no small-sample correction                           Green-Martin
    "MCD50.RAWNOSSHR.T1"   MCD(0.50), no small-sample correction                           Hardin-Rocke
    "RMCD50.T1"            reweighted MCD(0.50)                                            chi-squared
    "MCD50.RAW.T3"         MCD(0.50)                                                       chi-squared
    "MCD50.RAWGM.T3"       MCD(0.50)                                                       Green-Martin
    "MCD50.RAWHR.T3"       MCD(0.50)                                                       Hardin-Rocke
    "MCD50.RAWNOSSGM.T3"   MCD(0.50), no small-sample correction                           Green-Martin
    "MCD50.RAWNOSSHR.T3"   MCD(0.50), no small-sample correction                           Hardin-Rocke
    "RMCD50.T3"            reweighted MCD(0.50)                                            chi-squared
    "RMCD50.CH.T3"         reweighted MCD(0.50) with Bonferroni correction                 chi-squared
  3. The specified values of alpha correspond to the third dimension; the dimnames will be of the form “alpha” + alpha.
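
The layout described above can be explored without running any simulations. The following sketch builds a placeholder array with the documented shape and shows how to pick out one test at one significance level. The values are random filler, not simulation output, and the third-dimension names assume that "alpha" and the numeric value are concatenated without a separator.

```r
# Build the 26 documented column names in the documented order.
t1   <- c("MCDMBP.RAW.T1", "MCDMBP.RAWGM.T1", "MCDMBP.RAWHR.T1",
          "MCDMBP.RAWNOSSGM.T1", "MCDMBP.RAWNOSSHR.T1", "RMCDMBP.T1")
t3   <- c(sub("T1$", "T3", t1), "RMCDMBP.CH.T3")
mbp  <- c(t1, t3)
cols <- c(mbp, sub("MBP", "50", mbp, fixed = TRUE))

alpha <- c(0.01, 0.025, 0.05)
N     <- 100  # number of simulated runs (small, for illustration)

# Placeholder array: random filler standing in for real results.
set.seed(1)
res <- array(runif(N * length(cols) * length(alpha)),
             dim = c(N, length(cols), length(alpha)),
             dimnames = list(NULL, cols, paste0("alpha", alpha)))

# First three runs of the Hardin-Rocke test for MCD(0.50) at alpha = 0.05.
res[1:3, "MCD50.RAWHR.T1", "alpha0.05"]

# Average over runs: one row per column name, one column per alpha level.
avg <- apply(res, c(2, 3), mean)
dim(avg)  # 26 3
```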

Author(s)

Written and maintained by Christopher G. Green <christopher.g.green@gmail.com>

References

Andrea Cerioli, Marco Riani, and Anthony C. Atkinson. Controlling the size of multivariate outlier tests with the MCD estimator of scatter. Statistics and Computing, 19:341-353, 2009.

C. G. Green and R. Douglas Martin. An extension of a method of Hardin and Rocke, with an application to multivariate outlier detection via the IRMCD method of Cerioli. Working Paper, 2014. Available from http://students.washington.edu/cggreen/uwstat/papers/cerioli_extension.pdf

J. Hardin and D. M. Rocke. The distribution of robust distances. Journal of Computational and Graphical Statistics, 14:928-946, 2005.

See Also

table1sim.parallel CovMcd2

Examples

  ## Not run: 
    #
  
## End(Not run)
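
The calling convention from the Usage section can be sketched as follows. This is an unverified illustration, not an example supplied by the author: it assumes the function is exported from a package named HardinRockeExtension (inferred from the repository name), the argument values are arbitrary small numbers, and the call is skipped entirely when that package is not installed.

```r
library(parallel)

# Arbitrary small settings for illustration; not recommended values.
sim_args <- list(p = 2, nn = 50, N = 20, B = 10,
                 alpha = c(0.01, 0.025, 0.05))

# Assumption: the package is named HardinRockeExtension and exports the function.
if (requireNamespace("HardinRockeExtension", quietly = TRUE)) {
  cl <- makePSOCKcluster(2)  # the cluster must exist before the call
  res <- tryCatch(
    do.call(HardinRockeExtension::table13sim.parallel.check,
            c(list(cl = cl), sim_args)),
    finally = stopCluster(cl)  # always release the worker processes
  )
  dim(res)
}
```

Wrapping the call in tryCatch with a finally clause ensures the worker processes are shut down even if the simulation stops with an error.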

christopherggreen/HardinRockeExtension documentation built on May 13, 2019, 7:04 p.m.