Compute a robust correlation estimate based on winsorization, i.e., by shrinking outlying observations to the border of the main part of the data.
x | a numeric vector.
y | a numeric vector.
type | a character string specifying the type of winsorization to be used. Possible values are
standardized | a logical indicating whether the data are already robustly standardized.
centerFun | a function to compute a robust estimate for the center to be used for robust standardization (defaults to
scaleFun | a function to compute a robust estimate for the scale to be used for robust standardization (defaults to
const | numeric; tuning constant to be used in univariate or adjusted univariate winsorization (defaults to 2).
prob | numeric; probability for the quantile of the chi-squared distribution to be used in bivariate winsorization (defaults to 0.95).
tol | a small positive numeric value. This is used in bivariate winsorization to determine whether the initial estimate from adjusted univariate winsorization is close to 1 in absolute value. In this case, bivariate winsorization would fail since the points form almost a straight line, and the initial estimate is returned.
... | additional arguments to be passed to
The borders of the main part of the data are defined on the scale of the robustly standardized data. In univariate winsorization, the borders for each variable are given by +/-const, thus a symmetric distribution is assumed. In adjusted univariate winsorization, the borders for the two diagonally opposing quadrants containing the minority of the data are shrunken by a factor that depends on the ratio between the number of observations in the major and minor quadrants. It is thus possible to better account for the bivariate structure of the data while maintaining fast computation. In bivariate winsorization, a bivariate normal distribution is assumed and the data are shrunken towards the boundary of a tolerance ellipse with coverage probability prob. The boundary of this ellipse is given by all points that have a squared Mahalanobis distance equal to the quantile of the chi-squared distribution given by prob. Furthermore, the initial correlation matrix required for the Mahalanobis distances is computed based on adjusted univariate winsorization.
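As a rough illustration of the univariate case, the following sketch standardizes each variable, shrinks values beyond +/-const to the border, and then computes the ordinary correlation of the winsorized data. It assumes the median and the MAD as center and scale estimates, and winsorizeUnivariate() is a hypothetical helper written for this illustration, not a function of the package; the actual implementation may differ in its details. The last lines compute the chi-squared quantile that would define the border of the tolerance ellipse in bivariate winsorization for two variables and prob = 0.95.

```r
# Hypothetical helper for illustration only (not part of the package).
# Assumption: median and MAD are used for robust standardization.
winsorizeUnivariate <- function(x, const = 2) {
  z <- (x - median(x)) / mad(x)   # robustly standardize
  pmin(pmax(z, -const), const)    # shrink outlying values to the borders +/-const
}

set.seed(1234)  # for reproducibility
x <- rnorm(100)
y <- 0.6 * x + 0.8 * rnorm(100)
x[1] <- 10      # introduce an outlier
y[1] <- -10

wx <- winsorizeUnivariate(x)
wy <- winsorizeUnivariate(y)
rWins <- cor(wx, wy)  # correlation of the winsorized data
rWins

# squared Mahalanobis distance at the border of the tolerance ellipse
# used in bivariate winsorization (two variables, prob = 0.95)
border <- qchisq(0.95, df = 2)
border
```

Comparing rWins with cor(x, y) on the same data gives a sense of how much a single outlier can distort the classical estimate.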
The robust correlation estimate.
Andreas Alfons, based on code by Jafar A. Khan, Stefan Van Aelst and Ruben H. Zamar
Khan, J.A., Van Aelst, S. and Zamar, R.H. (2007) Robust linear model selection based on least angle regression. Journal of the American Statistical Association, 102(480), 1289–1299.
## Not run:
## generate data
library("mvtnorm")   # for rmvnorm()
library("robustHD")  # for corHuber()
set.seed(1234) # for reproducibility
Sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
xy <- rmvnorm(100, sigma=Sigma)
x <- xy[, 1]
y <- xy[, 2]
## introduce outlier
x[1] <- x[1] * 10
y[1] <- y[1] * (-5)
## compute correlation
cor(x, y)
corHuber(x, y)
## End(Not run)