# ellipsoidhull: Compute the Ellipsoid Hull or Spanning Ellipsoid of a Point... In cluster: "Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.

 ellipsoidhull R Documentation

## Compute the Ellipsoid Hull or Spanning Ellipsoid of a Point Set

### Description

Compute the “ellipsoid hull” or “spanning ellipsoid”, i.e. the ellipsoid of minimal volume (‘area’ in 2D) such that all given points lie just inside or on the boundary of the ellipsoid.

### Usage

```ellipsoidhull(x, tol=0.01, maxit=5000,
ret.wt = FALSE, ret.sqdist = FALSE, ret.pr = FALSE)
## S3 method for class 'ellipsoid'
print(x, digits = max(1, getOption("digits") - 2), ...)
```

### Arguments

 `x` the n p-dimensional points asnumeric n x p matrix. `tol` convergence tolerance for Titterington's algorithm. Setting this to much smaller values may drastically increase the number of iterations needed, and you may want to increas `maxit` as well. `maxit` integer giving the maximal number of iteration steps for the algorithm. `ret.wt, ret.sqdist, ret.pr` logicals indicating if additional information should be returned, `ret.wt` specifying the weights, `ret.sqdist` the squared distances and `ret.pr` the final probabilities in the algorithms. `digits,...` the usual arguments to `print` methods.

### Details

The “spanning ellipsoid” algorithm is said to stem from Titterington(1976), in Pison et al (1999) who use it for `clusplot.default`.
The problem can be seen as a special case of the “Min.Vol.” ellipsoid of which a more more flexible and general implementation is `cov.mve` in the `MASS` package.

### Value

an object of class `"ellipsoid"`, basically a `list` with several components, comprising at least

 `cov` p x p covariance matrix description the ellipsoid. `loc` p-dimensional location of the ellipsoid center. `d2` average squared radius. Further, d2 = t^2, where t is “the value of a t-statistic on the ellipse boundary” (from `ellipse` in the ellipse package), and hence, more usefully, `d2 = qchisq(alpha, df = p)`, where `alpha` is the confidence level for p-variate normally distributed data with location and covariance `loc` and `cov` to lie inside the ellipsoid. `wt` the vector of weights iff `ret.wt` was true. `sqdist` the vector of squared distances iff `ret.sqdist` was true. `prob` the vector of algorithm probabilities iff `ret.pr` was true. `it` number of iterations used. `tol, maxit` just the input argument, see above. `eps` the achieved tolerance which is the maximal squared radius minus p. `ierr` error code as from the algorithm; `0` means ok. `conv` logical indicating if the converged. This is defined as `it < maxit && ierr == 0`.

### Author(s)

Martin Maechler did the present class implementation; Rousseeuw et al did the underlying original code.

### References

Pison, G., Struyf, A. and Rousseeuw, P.J. (1999) Displaying a Clustering with CLUSPLOT, Computational Statistics and Data Analysis, 30, 381–392.

D.M. Titterington (1976) Algorithms for computing D-optimal design on finite design spaces. In Proc.\ of the 1976 Conf.\ on Information Science and Systems, 213–216; John Hopkins University.

`predict.ellipsoid` which is also the `predict` method for `ellipsoid` objects. `volume.ellipsoid` for an example of ‘manual’ `ellipsoid` object construction;
further `ellipse` from package ellipse and `ellipsePoints` from package sfsmisc.

`chull` for the convex hull, `clusplot` which makes use of this; `cov.mve`.

### Examples

```x <- rnorm(100)
xy <- unname(cbind(x, rnorm(100) + 2*x + 10))
exy. <- ellipsoidhull(xy)
exy. # >> calling print.ellipsoid()

plot(xy, main = "ellipsoidhull(<Gauss data>) -- 'spanning points'")
lines(predict(exy.), col="blue")
points(rbind(exy.\$loc), col = "red", cex = 3, pch = 13)

exy <- ellipsoidhull(xy, tol = 1e-7, ret.wt = TRUE, ret.sq = TRUE)
str(exy) # had small 'tol', hence many iterations
(ii <- which(zapsmall(exy \$ wt) > 1e-6))
## --> only about 4 to 6  "spanning ellipsoid" points
round(exy\$wt[ii],3); sum(exy\$wt[ii]) # weights summing to 1
points(xy[ii,], pch = 21, cex = 2,