TRC: Transformed rank correlations for multivariate outlier...


TRC R Documentation

Transformed rank correlations for multivariate outlier detection

Description

TRC starts from bivariate Spearman correlations and obtains a positive definite covariance matrix by back-transforming robust univariate medians and mads computed in the eigenspace. TRC copes with missing values through regression imputation, using a robust regression on the best predictor, and it takes sampling weights into account.
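To illustrate the starting point, the following base R sketch computes pairwise Spearman correlations on data with missing values and discards pairs that have too few jointly observed values. It is not the modi implementation: the toy data are made up, sampling weights are ignored, and setting under-supported correlations to 0 is an assumption made here for illustration.

```r
# Toy data with missing values (hypothetical, for illustration only)
set.seed(1)
x <- matrix(rnorm(60), ncol = 3, dimnames = list(NULL, c("a", "b", "c")))
x[1:18, "c"] <- NA                      # "c" is observed in only 2 rows

# Number of jointly observed values for every pair of variables
obs <- (!is.na(x)) * 1
n_joint <- crossprod(obs)

# Spearman correlations using all pairwise complete observations
rho <- cor(x, method = "spearman", use = "pairwise.complete.obs")

# Assumption: pairs with fewer than `overlap` joint observations
# are treated as uncorrelated
overlap <- 3
rho[n_joint < overlap] <- 0
diag(rho) <- 1

n_joint
rho
```

With only two jointly observed values, a rank correlation is always +1 or -1, which is why a minimum overlap is useful before trusting a correlation.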

Usage

TRC(
  data,
  weights,
  overlap = 3,
  mincor = 0,
  robust.regression = "rank",
  gamma = 0.5,
  prob.quantile = 0.75,
  alpha = 0.05,
  md.type = "m",
  monitor = FALSE
)

Arguments

data

a data frame or matrix with the data.

weights

sampling weights.

overlap

minimum number of jointly observed values for calculating the rank correlation.

mincor

minimal absolute correlation to impute.

robust.regression

type of regression: "irls" is an iteratively reweighted least squares M-estimator, "rank" is based on the rank correlations.

gamma

minimal number of jointly observed values to impute.

prob.quantile

if mads are 0, try this quantile of absolute deviations.

alpha

the (1 - alpha) quantile of the F-distribution is used for the cut-off.

md.type

type of Mahalanobis distance when missing values occur: "m" marginal (default), "c" conditional.

monitor

if TRUE, verbose output.
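The role of prob.quantile can be sketched in base R as follows. The helper robust_scale is hypothetical and not part of modi; it ignores sampling weights and applies no consistency factor to the fallback quantile, both of which are simplifications assumed here.

```r
# Hypothetical helper (not part of modi): use the mad as scale and fall
# back to a higher quantile of the absolute deviations when the mad is 0
robust_scale <- function(x, prob.quantile = 0.75) {
  x <- x[!is.na(x)]
  s <- mad(x)                            # stats::mad, scaled by 1.4826
  if (s == 0) {
    # At least half of the values equal the median, so the mad collapses;
    # try a higher quantile of the absolute deviations instead
    s <- quantile(abs(x - median(x)), probs = prob.quantile, names = FALSE)
  }
  s
}

x <- c(rep(0, 6), 1, 2, 3, 10)           # six of ten values are tied
mad(x)                                   # 0
robust_scale(x)                          # positive fallback scale
```

A zero scale would make the covariance estimate singular, so a fallback of this kind keeps variables with many tied values usable.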

Details

TRC is similar to a one-step OGK estimator where the starting covariances are obtained from rank correlations and an ad hoc missing value imputation plus weighting is provided.
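The idea can be sketched in base R for complete, unweighted data. This is a simplified illustration under stated assumptions, not the modi code: it skips imputation and weighting, applies no consistency transformation to the Spearman correlations, and works on a made-up data set.

```r
# Correlated toy data (hypothetical)
set.seed(42)
n <- 200
z <- matrix(rnorm(n * 3), ncol = 3)
x <- z %*% matrix(c(1, 0.5, 0, 0, 1, 0.5, 0, 0, 1), 3, 3)

# Step 1: starting covariance from Spearman correlations and mads
s  <- apply(x, 2, mad)
R  <- cor(x, method = "spearman")
S0 <- diag(s) %*% R %*% diag(s)

# Step 2: project the data onto the eigenvectors of the starting matrix
E <- eigen(S0, symmetric = TRUE)$vectors
y <- x %*% E

# Step 3: robust univariate location and scale in the eigenspace
m_y <- apply(y, 2, median)
s_y <- apply(y, 2, mad)

# Step 4: back-transform to the original coordinates
center  <- as.vector(E %*% m_y)
scatter <- E %*% diag(s_y^2) %*% t(E)

center
scatter
```

Because scatter is built as E diag(s_y^2) E' with strictly positive s_y, it is positive definite by construction, even when the starting matrix S0 is not.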

Value

TRC returns a list whose first component output is a sublist with the following components:

sample.size

Number of observations

number.of.variables

Number of variables

number.of.missing.items

Number of missing values

significance.level

1 - alpha

computation.time

Elapsed computation time

medians

Componentwise medians

mads

Componentwise mads

center

Location estimate

scatter

Covariance estimate

robust.regression

Input parameter

md.type

Input parameter

cutpoint

The default threshold MD-value for the cut-off of outliers
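As a rough illustration of an F-based cut-off, the sketch below calibrates the (1 - alpha) quantile of an F-distribution by the median of the squared distances. The exact formula, the degrees of freedom (p, n - p), and the simple coordinatewise distances are assumptions made for this example and may differ from what TRC computes.

```r
# Toy data with five planted outliers (hypothetical)
set.seed(7)
n <- 100
p <- 3
x <- matrix(rnorm(n * p), ncol = p)
x[1:5, ] <- x[1:5, ] + 8

# Squared distances from coordinatewise medians and mads
# (a stand-in for the TRC center and scatter)
d2 <- mahalanobis(x, apply(x, 2, median), diag(apply(x, 2, mad)^2))

# Assumed calibration: F quantile ratio times the median distance
alpha <- 0.05
cutpoint <- qf(1 - alpha, p, n - p) / qf(0.5, p, n - p) * median(d2)

outind <- d2 > cutpoint
which(outind)
```

Scaling by the median distance keeps the threshold tied to the bulk of the data, so a handful of extreme distances does not move it much.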

The further components returned by TRC are:

outind

Indicator of outliers

dist

Mahalanobis distances (with missing values)
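A marginal distance for incomplete observations (md.type = "m") can be sketched as follows. The helper md_marginal is hypothetical and not the modi implementation; it returns squared distances, and the rescaling by p / q (number of variables over number of observed values) is an assumption made here to keep distances comparable across rows.

```r
# Hypothetical helper (not part of modi): squared Mahalanobis distance
# on the observed dimensions of each row, rescaled by p / q
md_marginal <- function(x, center, scatter) {
  p <- ncol(x)
  apply(x, 1, function(row) {
    obs <- !is.na(row)
    q <- sum(obs)
    if (q == 0) return(NA_real_)         # nothing observed, no distance
    d <- row[obs] - center[obs]
    sum(d * solve(scatter[obs, obs, drop = FALSE], d)) * p / q
  })
}

x <- rbind(c(1, 2, 3),
           c(1, NA, 3),
           c(NA, NA, NA))
res <- md_marginal(x, center = c(0, 0, 0), scatter = diag(3))
res
```

With an identity scatter matrix, the complete first row gives 1 + 4 + 9 = 14, and the second row gives (1 + 9) * 3 / 2 = 15 from its two observed values.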

Author(s)

Beat Hulliger

References

Béguin, C. and Hulliger, B. (2004) Multivariate outlier detection in incomplete survey data: the epidemic algorithm and transformed rank correlations, JRSS-A, 167, Part 2, pp. 275-294.

Examples

data(bushfirem, bushfire.weights)
det.res <- TRC(bushfirem, weights = bushfire.weights)
PlotMD(det.res$dist, ncol(bushfirem))
print(det.res)

modi documentation built on March 31, 2023, 8:35 p.m.