L1median: Compute the Multivariate L1-Median aka 'Spatial Median' In robustX: 'eXtra' / 'eXperimental' Functionality for Robust Statistics

Description

Compute the multivariate L1-median m, also called “Spatial Median”, i.e., the minimizer of

sum(i = 1:n; || x[i] - m ||),

where || u || = sqrt(sum(j = 1:p; u[j]^2)).

As a convex problem, there's always a global minimizer, computable not by a closed formula but rather an iterative search. As the (partial) first derivatives of the objective function is undefined at the data points, the minimization is not entirely trivial.

Usage

 ```1 2 3 4 5``` ```L1median(X, m.init = colMedians(X), weights = NULL, method = c("nlm", "HoCrJo", "VardiZhang", optimMethods, nlminbMethods), pscale = apply(abs(centr(X, m.init)), 2, mean, trim = 0.40), tol = 1e-08, maxit = 200, trace = FALSE, zero.tol = 1e-15, ...) ```

Arguments

 `X` numeric `matrix` of dimension n x p, say. `m.init` starting value for m; typically and by default the coordinatewise median. `weights` optional numeric vector of non-negative weights; currently only implemented for method `"VardiZhang"`. `method` character string specifying the computational method, i.e., the algorithm to be used (can be abbreviated). `pscale` numeric p-vector of positive numbers, the coordinate-wise scale (typical size of delta(m[j])), where m is the problem's solution. `tol` positive number specifying the (relative) convergence tolerance. `maxit` positive integer specifying the maximal number of iterations (before the iterations are stopped prematurely if necessary). `trace` an integer specifying the tracing level of the iterations; `0` does no tracing `zero.tol` for method `"VardiZhang"`, a small positive number specifying the tolerance for determining that the iteration is ‘exactly’ at a data point (which is a singularity). `...` optional arguments to `nlm()` or the `control` (list) arguments of `optim()`, or `nlminb()`, respectively.

Details

Currently, we have to refer to the “References” below.

Value

currently the result depends strongly on the `method` used.

FIXME. This will change considerably.

Author(s)

Martin Maechler. Method `"HoCrJo"` is mostly based on Kristel Joossens' R function, implementing Hossjer and Croux (1995).

References

Hossjer and Croux, C. (1995). Generalizing Univariate Signed Rank Statistics for Testing and Estimating a Multivariate Location Parameter. Non-parametric Statistics 4, 293–308.

Vardi, Y. and Zhang, C.-H. (2000). The multivariate L_1-median and associated data depth. Proc. National Academy of Science 97(4), 1423–1426.

Fritz, H. and Filzmoser, P. and Croux, C. (2012) A comparison of algorithms for the multivariate L1-median. Computational Statistics 27, 393–410.

Kent, J. T., Er, F. and Constable, P. D. L. (2015) Algorithms for the spatial median;, in K. Nordhausen and S. Taskinen (eds), Modern Nonparametric, Robust and Multivariate Methods: Festschrift in Honour of Hannu Oja, Springer International Publishing, chapter 12, pp. 205–224. doi: 10.1007/978-3-319-22404-6_12

`median`, `covMcd`

CRAN package pcaPP added more L1 median methods, re-implementing our R versions in C++, see Fritz et al.(2012) and e.g., `l1median_NLM()`.

Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12``` ```data(stackloss) L1median(stackloss) L1median(stackloss, method = "HoCrJo") ## Explore all methods: m <- eval(formals(L1median)\$method); allMeths <- m[m != "Brent"] L1m <- sapply(allMeths, function(meth) L1median(stackloss, method = meth)) ## --> with a warning for L-BFGS-B str(L1m) pm <- sapply(L1m, function(.) if(is.numeric(.)) . else .\$par) t(pm) # SANN differs a bit; same objective ? ```

Example output

```\$minimum
[1] 253.638

\$estimate
[1] 59.03170 20.68484 86.66082 15.51665

[1]  8.071191e-10 -6.169280e-09 -7.256855e-09  5.295721e-10

\$code
[1] 1

\$iterations
[1] 12

Air.Flow Water.Temp Acid.Conc. stack.loss
59.03170   20.68484   86.66082   15.51665
Warning message:
In optim(par = m.init, fn = L1obj, method = method, control = list(parscale = pscale,  :
method L-BFGS-B uses 'factr' (and 'pgtol') instead of 'reltol' and 'abstol'
List of 9
\$ nlm        :List of 5
..\$ minimum   : num 254
..\$ estimate  : num [1:4] 59 20.7 86.7 15.5
..\$ gradient  : num [1:4] 8.07e-10 -6.17e-09 -7.26e-09 5.30e-10
..\$ code      : int 1
..\$ iterations: int 12
\$ HoCrJo     : Named num [1:4] 59 20.7 86.7 15.5
..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
\$ VardiZhang : Named num [1:4] 59 20.7 86.7 15.5
..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ par        : Named num [1:4] 59 20.7 86.7 15.5
.. ..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ value      : num 254
..\$ counts     : Named int [1:2] 151 NA
.. ..- attr(*, "names")= chr [1:2] "function" "gradient"
..\$ convergence: int 0
..\$ message    : NULL
\$ BFGS       :List of 5
..\$ par        : Named num [1:4] 59 20.7 86.7 15.5
.. ..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ value      : num 254
..\$ counts     : Named int [1:2] 30 8
.. ..- attr(*, "names")= chr [1:2] "function" "gradient"
..\$ convergence: int 0
..\$ message    : NULL
\$ CG         :List of 5
..\$ par        : Named num [1:4] 59 20.7 86.7 15.5
.. ..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ value      : num 254
..\$ counts     : Named int [1:2] 79 29
.. ..- attr(*, "names")= chr [1:2] "function" "gradient"
..\$ convergence: int 0
..\$ message    : NULL
\$ L-BFGS-B   :List of 5
..\$ par        : Named num [1:4] 59 20.7 86.7 15.5
.. ..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ value      : num 254
..\$ counts     : Named int [1:2] 9 9
.. ..- attr(*, "names")= chr [1:2] "function" "gradient"
..\$ convergence: int 0
..\$ message    : chr "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
\$ SANN       :List of 5
..\$ par        : Named num [1:4] 59.1 20.5 86.6 16.1
.. ..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ value      : num 254
..\$ counts     : Named int [1:2] 200 NA
.. ..- attr(*, "names")= chr [1:2] "function" "gradient"
..\$ convergence: int 0
..\$ message    : NULL
\$ nlminb     :List of 6
..\$ par        : Named num [1:4] 59 20.7 86.7 15.5
.. ..- attr(*, "names")= chr [1:4] "Air.Flow" "Water.Temp" "Acid.Conc." "stack.loss"
..\$ objective  : num 254
..\$ convergence: int 0
..\$ iterations : int 14
..\$ evaluations: Named int [1:2] 15 74
.. ..- attr(*, "names")= chr [1:2] "function" "gradient"
..\$ message    : chr "relative convergence (4)"
nlm  HoCrJo    VardiZhang Nelder-Mead BFGS      CG        L-BFGS-B
[1,] NULL Numeric,4 Numeric,4  Numeric,4   Numeric,4 Numeric,4 Numeric,4
SANN      nlminb
[1,] Numeric,4 Numeric,4
```

robustX documentation built on May 2, 2019, 5:16 p.m.