| ssMRCD | R Documentation |
The ssMRCD function calculates the spatially smoothed MRCD estimator from Puchhammer and Filzmoser (2023).
ssMRCD(
X,
groups = NULL,
weights,
lambda = 0.5,
tuning = list(method = NULL, plot = FALSE, k = 10, repetitions = 5, cont = 0.05),
TM = NULL,
alpha = 0.75,
maxcond = 50,
maxcsteps = 200,
n_initialhsets = NULL
)
X |
a list of matrices containing the observations per neighborhood (sorted), or a matrix or data frame containing the data. If a matrix or data frame is given, the groups vector must also be supplied. |
groups |
vector of neighborhood assignments |
weights |
weighting matrix, symmetrical, rows sum up to one and diagonals need to be zero (see also |
lambda |
numeric between 0 and 1. |
tuning |
List of tuning specifications, used if lambda contains more than one value; by default tuning$method is NULL. See Details. |
TM |
target matrix (optional), default value is the covMcd from robustbase. |
alpha |
numeric, proportion of values included, between 0.5 and 1. |
maxcond |
optional, maximal condition number used for rho-estimation. |
maxcsteps |
maximal number of c-steps before algorithm stops. |
n_initialhsets |
number of initial h-sets, default is 6 times number of neighborhoods. |
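The requirements on weights (symmetric, zero diagonal, rows summing to one) can be checked before calling ssMRCD. The following is a minimal base R sketch that assumes a hypothetical number of neighborhoods N and gives every other neighborhood the same weight; this equal-weight choice is only an illustration and not something prescribed by the package.

```r
# Hypothetical number of neighborhoods (illustrative value only)
N <- 4

# Equal weight for every other neighborhood, zero weight for the neighborhood itself
W <- (matrix(1, N, N) - diag(N)) / (N - 1)

# Check the three properties required of the weighting matrix
diag_ok <- all(diag(W) == 0)                                # zero diagonal
rows_ok <- isTRUE(all.equal(rowSums(W), rep(1, N)))         # rows sum to one
sym_ok  <- isSymmetric(W)                                   # symmetric
```

For N = 2 this construction reduces to the 2 x 2 matrix with zeros on the diagonal and ones elsewhere.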
The necessary list elements for the parameter tuning depend on the method specified.
For both tuning approaches (residual-based or contamination-based), the element method needs to be set to
"residuals" or "local contamination", respectively. The boolean list element plot is available for both methods and
specifies whether a plot should be constructed after tuning.
For tuning$method = "local contamination", additional information needs to be passed.
The number of nearest neighbors tuning$k used for the local outlier detection method
is 10 by default. The percentage of exchanged/contaminated observations is specified
by tuning$cont and is set to 0.05 by default. The coordinates must also be given in tuning$coords,
and the number of repetitions for the switching procedure in tuning$repetitions.
For tuning$method = "local contamination", no optimal value is returned; the choice has to
be made by the user. Be aware that the false negative rate (FNR) does not take into account that the data set also contains natural outliers
that might or might not be found. The best parameter selection depends on the goal of the analysis
and on whether false negatives should be avoided or the number of flagged outliers should be kept low.
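The list elements described above can be assembled as follows. This is a sketch only: the lambda grid and the coordinate matrix coords are hypothetical, and the layout of coords (one row per observation) is an assumption that is not confirmed here. The final call is left as a comment because it requires the ssMRCD package and suitable X and weights.

```r
# Hypothetical grid of smoothing values; tuning is relevant when lambda has more than one value
lambda_grid <- c(0.25, 0.5, 0.75)

# Hypothetical coordinates, assumed here to hold one row per observation
set.seed(1)
coords <- matrix(runif(400), ncol = 2)

# Residual-based tuning: method (and optionally plot) is specified
tuning_res <- list(method = "residuals", plot = FALSE)

# Contamination-based tuning: k, cont, coords and repetitions are passed as well
tuning_cont <- list(method = "local contamination", plot = FALSE,
                    k = 10, repetitions = 5, cont = 0.05, coords = coords)

# A call following the documented usage would then look like this (not run here):
# out <- ssMRCD(X = x, weights = W, lambda = lambda_grid, tuning = tuning_cont)
```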
The output depends on whether parameters are tuned.
If there is no tuning the output is an object of class "ssMRCD" containing the following elements:
MRCDcov | List of ssMRCD-covariance matrices sorted by neighborhood. |
MRCDicov | List of inverse ssMRCD-covariance matrices sorted by neighborhood. |
MRCDmu | List of ssMRCD-mean vectors sorted by neighborhood. |
mX | List of data matrices sorted by neighborhood. |
N | Number of neighborhoods. |
mT | Target matrix. |
rho | Vector of regularization values sorted by neighborhood. |
alpha | Scalar giving the proportion of observations to be used. |
h | Vector with the number of observations used per neighborhood, sorted. |
numiter | The number of iterations for the best initial h-set combination. |
c_alpha | Consistency factor for normality. |
weights | The weighting matrix. |
lambda | Smoothing factor. |
obj_fun_values | A matrix with objective function values for all initial h-set combinations (rows) and iterations (columns). |
best6pack | Initial h-set combinations with the best objective function values after c-step iterations. |
Kcov | MRCD estimates without smoothing. |
If parameters are tuned, the output consists of:
ssMRCD | Object of class ssMRCD with optimally selected parameter lambda. |
tuning_grid | Vector of lambda to tune over given by the input. |
tuning_values | If tuning$method = "residuals", a vector with the values of the residual criteria for the corresponding values of lambda in tuning_grid. If tuning$method = "local contamination", a matrix with false negative rates and the total number of flagged outliers. |
plot | If tuning$plot = TRUE, then a plot for parameter tuning is added. |
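As an illustration of how the per-neighborhood elements fit together, the sketch below computes squared Mahalanobis distances from mean vectors and inverse covariance matrices. It does not call ssMRCD: out_mock is a hypothetical stand-in that mimics only the documented elements mX, MRCDmu, MRCDicov and N, filled with classical (non-robust) estimates, and the distance computation is a generic use of these elements rather than the package's own outlier detection procedure.

```r
set.seed(1)

# Hypothetical stand-in for an untuned result; only the documented element
# names are mimicked, and the estimates are classical, not ssMRCD estimates
x <- list(matrix(rnorm(200), ncol = 2), matrix(rnorm(200), ncol = 2))
out_mock <- list(
  mX       = x,
  MRCDmu   = lapply(x, colMeans),
  MRCDicov = lapply(x, function(m) solve(cov(m))),
  N        = length(x)
)

# Squared Mahalanobis distances per neighborhood, using the inverse covariances
md2 <- lapply(seq_len(out_mock$N), function(i) {
  mahalanobis(out_mock$mX[[i]],
              center   = as.numeric(out_mock$MRCDmu[[i]]),  # plain vector, whatever the storage
              cov      = out_mock$MRCDicov[[i]],
              inverted = TRUE)                              # cov is already inverted
})
```

With a real result, the same pattern would apply to the corresponding list elements, provided they are stored as described in the table above.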
Puchhammer P. and Filzmoser P. (2023). Spatially Smoothed Robust Covariance Estimation for Local Outlier Detection. Journal of Computational and Graphical Statistics, 33(3), 928–940. doi:10.1080/10618600.2023.2277875
plot.ssMRCD
# create data set
x1 = matrix(runif(200), ncol = 2)
x2 = matrix(rnorm(200), ncol = 2)
x = list(x1, x2)
# create weighting matrix
W = matrix(c(0, 1, 1, 0), ncol = 2)
# calculate ssMRCD
out = ssMRCD(X = x, weights = W, lambda = 0.5)
str(out)