hdepth | R Documentation |
Computes the halfspace depth of p-dimensional points z relative to a p-dimensional dataset x. Computation is exact for p ≤ 3 and approximate when p > 3. For the approximate algorithm the halfspace depth is computed as the minimal univariate halfspace depth over many directions. To obtain the univariate halfspace depth in the direction v, the dataset x is projected on v, and the univariate location depth of the projected points v'z_i relative to the projected data xv is computed.
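The projection scheme described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names `univariate_depth` and `approx_hdepth`, the use of Gaussian random directions, and the defaults `ndir = 250` and `seed = 1` are all hypothetical choices made for the example, and depths are returned as raw counts.

```r
# Univariate halfspace depth of each value in t relative to the sample s:
# the smaller of the two closed one-sided counts.
univariate_depth <- function(t, s) {
  sapply(t, function(ti) min(sum(s <= ti), sum(s >= ti)))
}

# Approximate halfspace depth of the rows of z relative to the rows of x,
# taken as the minimum univariate depth over ndir random directions.
# (Hypothetical helper, not part of any package.)
approx_hdepth <- function(x, z, ndir = 250, seed = 1) {
  set.seed(seed)
  p <- ncol(x)
  depth <- rep(nrow(x), nrow(z))
  for (k in seq_len(ndir)) {
    v <- rnorm(p)
    v <- v / sqrt(sum(v^2))                  # random unit direction
    depth <- pmin(depth, univariate_depth(drop(z %*% v), drop(x %*% v)))
  }
  depth
}

set.seed(42)
x <- matrix(rnorm(200), ncol = 4)            # 50 points in p = 4 dimensions
z <- rbind(colMeans(x), rep(10, 4))          # a central point and a far outlier
d <- approx_hdepth(x, z)
print(d)
```

Because every direction yields an upper bound on the true depth, the minimum over a finite set of directions can overestimate the halfspace depth but never underestimate it.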
hdepth(x, z = NULL, options = list())
x |
An |
z |
An optional |
options |
A list of available options:
|
Halfspace depth was introduced by Tukey (1975). The halfspace depth of a point z_i is defined as the minimal number of observations from x that are contained in any closed halfspace with boundary through z_i.
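For bivariate data the definition can be checked directly by brute force. The sketch below is an assumption-laden illustration (the function name `exact_hdepth_2d` is hypothetical, and this is a simple quadratic-time scan, not the Rousseeuw and Ruts algorithm). It uses the fact that the count of points in a closed halfplane through z only changes when the boundary line passes through a data point, so it suffices to evaluate one direction strictly between each pair of consecutive critical angles. The result is a raw count, as in the definition; the scale of the values returned by hdepth may differ.

```r
# Brute-force halfspace depth of a single point z relative to the rows of
# a two-column matrix x (hypothetical helper, for illustration only).
exact_hdepth_2d <- function(x, z) {
  d <- sweep(x, 2, z)                        # differences x_i - z
  same <- rowSums(d^2) == 0
  base <- sum(same)                          # points equal to z lie in every halfplane
  d <- d[!same, , drop = FALSE]
  if (nrow(d) == 0) return(base)
  theta <- atan2(d[, 2], d[, 1])
  # Normal directions for which some data point lies on the boundary line
  crit <- sort(unique(c(theta + pi / 2, theta - pi / 2) %% (2 * pi)))
  # One direction strictly inside each interval between critical angles
  mids <- (crit + c(crit[-1], crit[1] + 2 * pi)) / 2
  counts <- sapply(mids, function(a) sum(d %*% c(cos(a), sin(a)) >= 0))
  base + min(counts)
}

# Four corners of a square plus its centre
x <- rbind(c(1, 1), c(-1, 1), c(-1, -1), c(1, -1), c(0, 0))
depth_centre  <- exact_hdepth_2d(x, c(0, 0))
depth_corner  <- exact_hdepth_2d(x, c(1, 1))
depth_outside <- exact_hdepth_2d(x, c(2, 2))
```

In this example any closed halfplane through the centre contains the centre itself and at least two corners, a corner can be cut off on its own, and a point outside the square can be separated from all observations.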
In dimensions p=2 and p=3 the computations are by default carried out exactly using the algorithms described in Rousseeuw and Ruts (1996) and Rousseeuw and Struyf (1998). This yields an affine invariant measure of depth.
Approximate algorithms are also implemented which are affine, rotation or shift invariant, depending on the value chosen for type. They can be used in any dimension. The shift invariant algorithm coincides with the random Tukey depth (Cuesta-Albertos and Nieto-Reyes, 2008).
The resulting halfspace depth values are affine invariant when the exact algorithm is used. For the approximate algorithms they are affine, rotation or shift invariant, depending on the choice for type, provided that the seed is kept fixed across runs of the algorithm. Note that the halfspace depth values are guaranteed not to increase when more directions are considered, provided the seed is kept fixed, as this ensures that the random directions are generated in a fixed order.
If the halfspace depth needs to be computed for m points z_i, it is recommended to apply the function once with the matrix z as input, instead of applying it m times with input vectors z_i, as numerous computations can be saved. The approximate algorithms then automatically also compute the depth values of the observations in x.
When only the halfspace depth of the observations in x is required, the call to the function should be hdepth(x) or equivalently hdepth(x,x). In that case the depth values will be stored in the 'depthZ' output field. For bivariate data these will be the exact values by default.
To visualize the depth of bivariate data one can apply the mrainbowplot function. It plots the data colored according to their depth.
The function first checks whether the data lie in a subspace of dimension smaller than p. If so, a warning is given, and the dimension of the subspace is reported together with a direction that is orthogonal to it.
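One way such a check could be carried out is with a singular value decomposition of the centred data. The sketch below is an assumption, not the package's actual procedure: the function name `find_subspace`, the relative tolerance `tol`, and the returned field names are hypothetical.

```r
# Detect whether the rows of x lie in a lower-dimensional affine subspace
# (hypothetical helper; tol is an assumed relative tolerance).
find_subspace <- function(x, tol = 1e-8) {
  xc <- scale(x, center = TRUE, scale = FALSE)   # centre the columns
  p <- ncol(x)
  s <- svd(xc, nu = 0, nv = p)                   # all p right singular vectors
  dimension <- if (max(s$d) > 0) sum(s$d > tol * max(s$d)) else 0
  if (dimension < p) {
    warning("data lie in a subspace of dimension ", dimension)
    # The last right singular vector is orthogonal to the subspace
    list(dimension = dimension, orthogonal_direction = s$v[, p])
  } else {
    list(dimension = p, orthogonal_direction = NULL)
  }
}

set.seed(7)
x1 <- rnorm(20)
x2 <- rnorm(20)
x <- cbind(x1, x2, x1 + x2)                      # third column is redundant

# The warning is silenced here only to keep the example output short
res <- suppressWarnings(find_subspace(x))
residual <- max(abs(scale(x, scale = FALSE) %*% res$orthogonal_direction))

full <- find_subspace(matrix(rnorm(40), ncol = 2))
```

Here `res$dimension` reports the dimension of the subspace, and `residual` measures how far the centred data deviate from being orthogonal to the reported direction.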
A list with components:
depthX |
Vector of length |
depthZ |
Vector of length |
singularSubsets |
When the input parameter type is equal to |
dimension |
When the data |
hyperplane |
When the data |
P. Segaert based on Fortran code by P.J. Rousseeuw, I. Ruts and A. Struyf, and C++ code by P. Segaert and K. Vakili.
Tukey J. (1975). Mathematics and the picturing of data. Proceedings of the International Congress of Mathematicians, 2, 523–531, Vancouver.
Rousseeuw P.J., Ruts I. (1996). AS 307: Bivariate location depth. Journal of the Royal Statistical Society: Series C, 45, 516–526.
Rousseeuw P.J., Struyf A. (1998). Computing location depth and regression depth in higher dimensions. Statistics and Computing, 8, 193–203.
Cuesta-Albertos J., Nieto-Reyes A. (2008). The random Tukey depth. Computational Statistics & Data Analysis, 52, 4979–4988.
hdepthmedian, mrainbowplot, bagdistance, bagplot
# Compute the halfspace depth of a simple
# two-dimensional dataset.
data(cardata90)
Result <- hdepth(x = cardata90)
mrainbowplot(cardata90, depths = Result$depthZ)
# In two dimensions we may also opt to use the
# approximate algorithm. The number of directions
# may be specified through the option list.
options <- list(type = "Rotation",
                ndir = 750,
                approx = TRUE)
Result <- hdepth(x = cardata90, options = options)
# With a fixed seed, the resulting halfspace depth is monotone
# non-increasing in the number of directions.
options <- list(type = "Rotation",
                ndir = 10,
                approx = TRUE)
Result1 <- hdepth(x = cardata90, options = options)
options <- list(type = "Rotation",
                ndir = 500,
                approx = TRUE)
Result2 <- hdepth(x = cardata90, options = options)
which(Result1$depthZ - Result2$depthZ < 0)
# This is however not the case when the seed is changed
options <- list(type = "Rotation",
                ndir = 10,
                approx = TRUE)
Result1 <- hdepth(x = cardata90, options = options)
options <- list(type = "Rotation",
                ndir = 50,
                approx = TRUE,
                seed = 897)
Result2 <- hdepth(x = cardata90, options = options)
which(Result1$depthZ - Result2$depthZ < 0)
plot(Result1$depthZ - Result2$depthZ,
     xlab = "Index", ylab = "Difference in halfspace depth")
# We can also consider directions through two data
# points. If the sample is small enough one may opt
# to search over all choose(n,2) directions.
# Note that the computational load increases substantially
# as n becomes larger.
options <- list(type = "Rotation",
                ndir = "all",
                approx = TRUE)
Result1 <- hdepth(x = cardata90, options = options)
# Alternatively one may consider randomly generated directions.
options <- list(type = "Shift",
                ndir = 250,
                approx = TRUE)
Result1 <- hdepth(x = cardata90, options = options)