RobEstControl-class    R Documentation

Class 'RobEstControl' - contains control parameters for the robust estimation of parametric interval data models.

Description

This class extends the CovControlMcd class and contains control parameters for the robust estimation of parametric interval data models.

Objects from the Class

Objects can be created by calls of the form new("RobEstControl", ...) or by calling the constructor-function RobEstControl.
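As a minimal sketch of the constructor route, the snippet below builds a control object and reads one slot back. It assumes the MAINT.Data package is installed and that the constructor accepts arguments named after the slots documented on this page (alpha and getalpha here); the values are purely illustrative.

```r
# Sketch only: runs the example when MAINT.Data is available.
# Assumption: RobEstControl() takes arguments named like the slots below.
if (requireNamespace("MAINT.Data", quietly = TRUE)) {
  library(MAINT.Data)
  ctrl <- RobEstControl(alpha = 0.75, getalpha = "TwoStep")
  # S4 slots are read with the @ operator
  print(ctrl@alpha)
  print(ctrl@getalpha)
}
```

The resulting object would then typically be passed to an estimation routine such as fasttle; see that function's help page for the exact argument name.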

Slots

alpha:

Inherited from class "CovControlMcd". Numeric parameter controlling the size of the subsets over which the trimmed likelihood is maximized; roughly alpha*nrow(Sdt) observations are used for computing the trimmed likelihood. Allowed values are between 0.5 and 1. Note that when argument ‘getalpha’ is set to “TwoStep” the final value of ‘alpha’ is estimated by a two-step procedure and the value of argument ‘alpha’ is only used to specify the size of the samples used in the first step.
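To make the "roughly alpha*nrow(Sdt)" statement concrete, here is a small base-R illustration. The sample size n is hypothetical, and the use of floor() is an assumption; the package may round differently.

```r
n <- 200        # hypothetical number of observations, i.e. nrow(Sdt)
alpha <- 0.75   # must lie in the allowed range [0.5, 1]
stopifnot(alpha >= 0.5, alpha <= 1)
# Approximate size of the subsets over which the trimmed likelihood is maximized
h <- floor(alpha * n)
h
```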

nsamp:

Inherited from class "CovControlMcd". Number of subsets used for initial estimates.

scalefn:

Inherited from class "CovControlMcd" and not used in the package ‘MAINT.Data’.

maxcsteps:

Inherited from class "CovControlMcd" and not used in the package ‘MAINT.Data’.

seed:

Inherited from class "CovControlMcd". Starting value for the random number generator. Default is seed = NULL.

use.correction:

Inherited from class "CovControlMcd". Whether to use finite sample correction factors. Default is use.correction=TRUE.

trace, tolSolve:

Inherited from class "CovControl".

ncsteps:

The maximum number of concentration steps used in each iteration of the fasttle algorithm.

getalpha:

Argument specifying whether the ‘alpha’ parameter (roughly the proportion of the sample used for computing the trimmed likelihood) should be estimated from the data, or whether the value of the argument ‘alpha’ should be used instead. When set to “TwoStep”, ‘alpha’ is estimated by a two-step procedure, with the value of argument ‘alpha’ specifying the size of the samples used in the first step. Otherwise, the value of argument ‘alpha’ is used directly.

rawMD2Dist:

The assumed reference distribution of the raw MCD squared distances, which is used to find the cutoffs defining the observations kept in one-step reweighted MCD estimates. Alternatives are ‘ChiSq’, ‘HardRockeAsF’ and ‘HardRockeAdjF’, respectively for the usual Chi-squared, and the asymptotic and adjusted scaled F distributions proposed by Hardin and Rocke (2005).

MD2Dist:

The assumed reference distributions used to find cutoffs defining the observations assumed as outliers. Alternatives are “ChiSq” and “CerioliBetaF”, respectively for the usual Chi-squared, and the Beta and F distributions proposed by Cerioli (2010).

eta:

Nominal size of the null hypothesis that a given observation is not an outlier. Defines the raw MCD Mahalanobis distances cutoff used to choose the observations kept in the reweighting step.
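As a rough illustration of how a nominal size translates into a distance cutoff, the sketch below takes the (1 - eta) quantile of a Chi-squared distribution. This assumes the ‘ChiSq’ reference distribution and the usual quantile rule; the dimension q is hypothetical, and the package's exact computation is not shown on this page.

```r
eta <- 0.025   # illustrative nominal size
q <- 8         # hypothetical number of variables entering the distances
# Squared distances above this value would be flagged at size eta
# (assumption: cutoff = (1 - eta) quantile of a Chi-squared with q d.f.)
cutoff <- qchisq(1 - eta, df = q)
cutoff
```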

multiCmpCor:

Whether a multicomparison correction of the nominal size (‘eta’) for the outlier tests should be performed. Alternatives are ‘never’, which ignores the multicomparisons and tests all entities at ‘eta’; ‘always’, which tests all n entities at 1 - (1 - ‘eta’)^(1/n); and ‘iterstep’, which, as suggested by Cerioli (2010), makes an initial set of tests using the nominal size 1 - (1 - ‘eta’)^(1/n) and stops if no outliers are detected, and otherwise makes a second step testing for outliers at ‘eta’.
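The per-test size used by the corrected alternatives can be computed directly. The sketch below reads the correction as the Šidák-type adjustment 1 - (1 - eta)^(1/n), which keeps the overall size of n independent tests at eta; the values of eta and n are illustrative only.

```r
eta <- 0.025   # illustrative overall nominal size
n <- 100       # hypothetical number of entities tested
# Per-test size such that n independent tests have overall size eta
eta_adj <- 1 - (1 - eta)^(1 / n)
# Sanity check: recover the overall size from the per-test size
overall <- 1 - (1 - eta_adj)^n
c(per_test = eta_adj, overall = overall)
```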

getkdblstar:

Argument specifying the size of the initial subsets, which are kept small in order to minimize the probability of including outliers. If set to the string “Twopplusone” (default), the initial sets have twice the number of interval-valued variables plus one (i.e., they are the smallest samples that lead to a non-singular covariance estimate). Otherwise, an integer with the size of the initial sets.
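The "twice p plus one" rule can be checked numerically. The sketch below assumes each of p interval-valued variables contributes two real variables (a MidPoint and a LogRange), so the data have q = 2p columns; it uses simulated data and base R only, and compares the rank of the centered data (which equals the rank of the sample covariance) for q and q + 1 observations.

```r
set.seed(1)
p <- 4                 # hypothetical number of interval-valued variables
q <- 2 * p             # MidPoints and LogRanges (assumed representation)
kdblstar <- q + 1      # subset size under the "Twopplusone" rule

# Rank of the centered data matrix = rank of the sample covariance
centered_rank <- function(X) qr(scale(X, center = TRUE, scale = FALSE))$rank

rank_q  <- centered_rank(matrix(rnorm(q * q), nrow = q))               # q rows
rank_q1 <- centered_rank(matrix(rnorm(kdblstar * q), nrow = kdblstar)) # q + 1 rows
c(rank_q = rank_q, rank_q1 = rank_q1, full_rank = q)
```

With only q observations the centering removes one degree of freedom, so the covariance estimate cannot reach full rank; q + 1 observations is the smallest size at which it can.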

k2max:

Maximal allowed l2-norm condition number for correlation matrices. Correlation matrices with condition number above k2max are considered to be numerically singular, leading to degenerate results.
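For intuition, the l2-norm condition number of a correlation matrix can be obtained in base R with kappa(..., exact = TRUE). The matrix and the threshold below are hypothetical and only illustrate the comparison against k2max.

```r
# Nearly collinear pair of variables (illustrative correlation matrix)
R <- matrix(c(1, 0.999999,
              0.999999, 1), nrow = 2, byrow = TRUE)
# Exact 2-norm condition number: ratio of largest to smallest singular value
k2 <- kappa(R, exact = TRUE)
k2max <- 1e6   # hypothetical threshold, not necessarily the package default
numerically_singular <- k2 > k2max
numerically_singular
```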

outlin:

The type of outliers to be considered: “MidPandLogR” if outliers may be present in both MidPoints and LogRanges, “MidP” if outliers are only present in MidPoints, or “LogR” if outliers are only present in LogRanges.

trialmethod:

The method to find a trial subset used to initialize each replication of the fasttle algorithm. The current options are “simple” (default), which simply selects ‘kdblstar’ observations at random, and “Poolm”, which divides the original sample into ‘m’ non-overlapping subsets, applies the ‘simple trial’ and the refinement methods to each one of them, and merges the results into a trial subset.

m:

Number of non-overlapping subsets used by the trial method when the argument ‘trialmethod’ is set to “Poolm”.
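The partitioning step of the “Poolm” trial method can be pictured as follows. This is only a sketch of splitting n observations into m disjoint subsets at random; n and m are hypothetical, and the way the package actually forms and merges the subsets may differ.

```r
set.seed(42)
n <- 20   # hypothetical sample size
m <- 4    # number of non-overlapping subsets
# Shuffle the observation indices, then deal them into m groups
groups <- split(sample(n), rep(seq_len(m), length.out = n))
lengths(groups)
```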

reweighted:

Should a (re)weighted estimate of the covariance matrix be used in the computation of the trimmed likelihood, or just a “raw” covariance estimate? Default is (re)weighting.

otpType:

The amount of output returned by fasttle. Current options are “OnlyEst” (default), where only an ‘IdtE’ object with the fasttle estimates is returned; “SetMD2andEst”, which returns a list with an ‘IdtE’ object of fasttle estimates, a vector with the final trimmed subset elements used to compute these estimates, and the corresponding robust squared Mahalanobis distances; and “SetMD2EstandPrfSt”, which returns a list with the previous three components plus a list of some performance statistics concerning the algorithm execution.

Extends

Class CovControlMcd, directly. Class CovControl by CovControlMcd, distance 2.

Methods

No methods defined with class "RobEstControl" in the signature.

References

Cerioli, A. (2010), Multivariate Outlier Detection with High-Breakdown Estimators. Journal of the American Statistical Association 105 (489), 147–156.

Duarte Silva, A.P., Filzmoser, P. and Brito, P. (2017), Outlier detection in interval data. Advances in Data Analysis and Classification, 1–38.

Hardin, J. and Rocke, A. (2005), The Distribution of Robust Distances. Journal of Computational and Graphical Statistics 14, 910–927.

Todorov V. and Filzmoser P. (2009), An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software 32(3), 1–47.

See Also

RobEstControl, fasttle, RobMxtDEst, Roblda, Robqda


MAINT.Data documentation built on April 4, 2023, 9:09 a.m.