table1sim.parallel: Replicate the experiment presented in Cerioli et al. (2009)
In christopherggreen/HardinRockeExtensionSimulations: Replication and Extension of Hardin and Rocke (2005), Cerioli et al. (2009)


Replicates the experiment presented in Cerioli et al. (2009), Table 1, for a wider variety of estimators.

table1sim.parallel(cl, p, nn, N, B = 10000, alpha = c(0.01, 0.025, 0.05), cutoff.method = "GM14", lgf = "")

`cl`	A cluster object, e.g., one returned by `makePSOCKcluster`. The user must create this object before calling `table1sim.parallel`.
`p`	The dimension of the data used in each simulated run.
`nn`	The number of observations used in each simulated run.
`N`	The number of simulations to run.
`B`	The batch/block size: the number of simulations to run in each block. This is useful for very large simulations (`N` very large) where memory is a concern.
`alpha`	The significance level to use for detecting outliers. Can be a vector; the outlier detection tests will be run at each level.
`cutoff.method`	String indicating which asymptotic distribution to use for the MCD-based distances. Valid values are `"HR05"` for the method of Hardin and Rocke (2005) and `"GM14"` for the method of Green and Martin (2014). Default is `"GM14"`.
`lgf`	Path to log file into which logging information should be written.

This is a work function designed for replicating Table 1 of Cerioli et al. (2009), page XXX, but using the asymptotic method of Green and Martin (2014) instead of the Hardin-Rocke method. The experiment investigates how many false positives certain Mahalanobis-distance-based outlier tests produce, compared with the nominal Type I error rate α.
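To make the idea of comparing an empirical false-positive rate with the nominal α concrete, here is a minimal base-R sketch. It is a simplification and not the package's code: it uses classical Mahalanobis distances with the true mean and covariance (so the chi-squared cutoff is exact), it counts per-observation exceedances, and the variable names and the number of runs are illustrative assumptions. The robust estimators and test definitions used by `table1sim.parallel` may differ.

```r
# Sketch only: classical distances with known parameters, not the robust
# estimators or the exact tests used by table1sim.parallel.
set.seed(1)

p     <- 4      # dimension of the data
nn    <- 300    # observations per simulated data set
runs  <- 200    # number of simulated data sets (illustrative choice)
alpha <- c(0.01, 0.025, 0.05)

# Chi-squared cutoffs for squared Mahalanobis distances at each level
cutoffs <- qchisq(1 - alpha, df = p)

# For each run, draw outlier-free N(0, I_p) data and record the fraction of
# observations whose squared distance exceeds each cutoff; every exceedance
# is a false positive because the data contain no outliers.
rates <- replicate(runs, {
  x  <- matrix(rnorm(nn * p), nrow = nn, ncol = p)
  d2 <- mahalanobis(x, center = rep(0, p), cov = diag(p))
  sapply(cutoffs, function(k) mean(d2 > k))
})
rownames(rates) <- paste0("alpha", alpha)

# Average empirical false-positive rate, to compare with the nominal alpha
empirical <- rowMeans(rates)
print(round(empirical, 4))
```

With known parameters the empirical rates should sit close to the nominal levels; the point of the experiment described above is that this need not hold once robust estimates replace the true mean and covariance.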

Internally the simulation function does B runs at a time. Blocks of size B are distributed across the cluster. Use a smaller B if your machines have little memory or if you have many cluster nodes.
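The blocking scheme can be pictured with a short sketch. The helper `block_sizes` below is hypothetical and not part of the package; in particular, how the package treats an `N` that is not a multiple of `B` is not documented here, so the handling of the final partial block is an assumption.

```r
# Hypothetical helper (not part of the package): split N simulation runs
# into blocks of at most B runs each.
block_sizes <- function(N, B) {
  stopifnot(N >= 1, B >= 1)
  full <- N %/% B          # number of complete blocks
  rest <- N %% B           # runs left over for a final, smaller block
  c(rep(B, full), if (rest > 0) rest)
}

blocks <- block_sizes(N = 500, B = 50)
length(blocks)   # number of blocks handed to the cluster
sum(blocks)      # total number of runs, equal to N

# A run count that is not a multiple of the block size
block_sizes(N = 25, B = 10)
```

Each element of `blocks` is one unit of work; with the `parallel` package such a vector could be passed to a function like `parLapply` so that each node simulates one block at a time. A smaller `B` gives more, lighter blocks, which lowers per-node memory use and spreads work more evenly when there are many nodes.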

A three-dimensional array:

The results of each of the N simulation runs appear along the first dimension.

The various estimators and tests appear along the second dimension. Currently the results appear in the following order.

Column Name	Covariance Estimate	Test Statistic
"OGK"	OGK estimate	chi-squared
"ROGK"	Reweighted OGK estimate	chi-squared
"SEST.BS"	S-estimate using bisquare	chi-squared
"SEST.RK"	S-estimate using Rocke	chi-squared
"MCD50.RAW"	MCD(0.5)	chi-squared
"MCD50.HRRAW"	MCD(0.5)	Hardin-Rocke
"MCD50.HRADJ"	MCD(0.5)	Hardin-Rocke (adj.)
"RMCD50"	reweighted MCD(0.5)	chi-squared
"MCD75.RAW"	MCD(0.75)	chi-squared
"MCD75.HRRAW"	MCD(0.75)	Hardin-Rocke
"MCD75.HRADJ"	MCD(0.75)	Hardin-Rocke (adj.)
"RMCD75"	reweighted MCD(0.75)	chi-squared
"MCD95.RAW"	MCD(0.95)	chi-squared
"MCD95.HRRAW"	MCD(0.95)	Hardin-Rocke
"MCD95.HRADJ"	MCD(0.95)	Hardin-Rocke (adj.)
"RMCD95"	reweighted MCD(0.95)	chi-squared

The adjusted versions of the Hardin-Rocke tests remove the finite-sample correction when the sample size is 100 or greater.

The specified values of alpha correspond to the third dimension; the dimnames will be of the form “alpha” + alpha.
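As an illustration of this layout, the sketch below builds a placeholder array with the documented dimensions and summarizes it over the simulation runs. The values are random stand-ins rather than simulation output, and the exact label format (`"alpha0.01"`, produced here with `paste0`) is an assumption based on the description above.

```r
# Placeholder array with the documented layout; the values are random
# stand-ins, NOT output from table1sim.parallel.
set.seed(1)
N     <- 20
alpha <- c(0.01, 0.025, 0.05)
est   <- c("OGK", "ROGK", "SEST.BS", "SEST.RK",
           "MCD50.RAW", "MCD50.HRRAW", "MCD50.HRADJ", "RMCD50",
           "MCD75.RAW", "MCD75.HRRAW", "MCD75.HRADJ", "RMCD75",
           "MCD95.RAW", "MCD95.HRRAW", "MCD95.HRADJ", "RMCD95")

results <- array(runif(N * length(est) * length(alpha)),
                 dim = c(N, length(est), length(alpha)),
                 dimnames = list(NULL, est, paste0("alpha", alpha)))

# Average over runs (first dimension): one row per estimator/test,
# one column per significance level
means <- apply(results, c(2, 3), mean)

# Monte Carlo standard error of each average
mcse <- apply(results, c(2, 3), sd) / sqrt(N)

dim(means)
means["RMCD50", "alpha0.05"]
```

Indexing by name, as in the last line, avoids relying on the column order, which the documentation describes as the current order only.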

This version is deprecated.

Written and maintained by Christopher G. Green <christopher.g.green@gmail.com>

Andrea Cerioli, Marco Riani, and Anthony C. Atkinson. Controlling the size of multivariate outlier tests with the MCD estimator of scatter. Statistics and Computing, 19:341-353, 2009.

C. G. Green and R. Douglas Martin. An extension of a method of Hardin and Rocke, with an application to multivariate outlier detection via the IRMCD method of Cerioli. Working Paper, 2014. Available from http://students.washington.edu/cggreen/uwstat/papers/cerioli_extension.pdf

J. Hardin and D. M. Rocke. The distribution of robust distances. Journal of Computational and Graphical Statistics, 14:928-946, 2005.

  ## Not run: 
    # This runs a small experiment and assumes a cluster is available.
    # The vignette provides a better recipe for
    # replicating Cerioli et al. (2009).

    library(parallel)
    library(CerioliOutlierDetection)
    library(HardinRockeExtensionSimulations)

    # We use a socket cluster on Windows;
    # change to your preferred method of
    # creating a cluster.
    thecluster <- makePSOCKcluster(4)

    N.SIM <- 500
    B.SIM <- 50

    # Log file location (an illustrative choice; use any writable path)
    logfile <- file.path(tempdir(), "table1sim.log")

    # Initialize each node by loading the required packages
    tmp.rv <- clusterEvalQ(cl = thecluster, {

      require(abind,                           quietly = TRUE)
      require(rrcov,                           quietly = TRUE)
      require(mvtnorm,                         quietly = TRUE)
      require(CerioliOutlierDetection,         quietly = TRUE)
      require(HardinRockeExtensionSimulations, quietly = TRUE)

      # Pause kept from the original example
      Sys.sleep(30)

      invisible(NULL)
    })

    # Run N.SIM simulations in blocks of B.SIM
    results <- table1sim.parallel(cl = thecluster, p = 4, nn = 300,
          N = N.SIM, B = B.SIM, lgf = logfile)
    stopCluster(thecluster)

    # Calculate some summary statistics over the simulation runs
    apply(results, c(2, 3), mean)
    apply(results, c(2, 3), sd)

## End(Not run)

christopherggreen/HardinRockeExtensionSimulations documentation built on May 13, 2019, 7:04 p.m.
