# rlm: Robust Fitting of Linear Models In MASS: Support Functions and Datasets for Venables and Ripley's MASS

 rlm R Documentation

## Robust Fitting of Linear Models

### Description

Fit a linear model by robust regression using an M estimator.

### Usage

``````rlm(x, ...)

## S3 method for class 'formula'
rlm(formula, data, weights, ..., subset, na.action,
method = c("M", "MM", "model.frame"),
wt.method = c("inv.var", "case"),
model = TRUE, x.ret = TRUE, y.ret = FALSE, contrasts = NULL)

## Default S3 method:
rlm(x, y, weights, ..., w = rep(1, nrow(x)),
init = "ls", psi = psi.huber,
scale.est = c("MAD", "Huber", "proposal 2"), k2 = 1.345,
method = c("M", "MM"), wt.method = c("inv.var", "case"),
maxit = 20, acc = 1e-4, test.vec = "resid", lqs.control = NULL)

psi.huber(u, k = 1.345, deriv = 0)
psi.hampel(u, a = 2, b = 4, c = 8, deriv = 0)
psi.bisquare(u, c = 4.685, deriv = 0)
``````

### Arguments

 `formula` a formula of the form `y ~ x1 + x2 + ...`. `data` an optional data frame, list or environment from which variables specified in `formula` are preferentially to be taken. `weights` a vector of prior weights for each case. `subset` An index vector specifying the cases to be used in fitting. `na.action` A function to specify the action to be taken if `NA`s are found. The ‘factory-fresh’ default action in R is `na.omit`, and can be changed by `options(na.action=)`. `x` a matrix or data frame containing the explanatory variables. `y` the response: a vector of length the number of rows of `x`. `method` currently either M-estimation or MM-estimation or (for the `formula` method only) find the model frame. MM-estimation is M-estimation with Tukey's biweight initialized by a specific S-estimator. See the ‘Details’ section. `wt.method` are the weights case weights (giving the relative importance of case, so a weight of 2 means there are two of these) or the inverse of the variances, so a weight of two means this error is half as variable? `model` should the model frame be returned in the object? `x.ret` should the model matrix be returned in the object? `y.ret` should the response be returned in the object? `contrasts` optional contrast specifications: see `lm`. `w` (optional) initial down-weighting for each case. `init` (optional) initial values for the coefficients OR a method to find initial values OR the result of a fit with a `coef` component. Known methods are `"ls"` (the default) for an initial least-squares fit using weights `w*weights`, and `"lts"` for an unweighted least-trimmed squares fit with 200 samples. `psi` the psi function is specified by this argument. It must give (possibly by name) a function `g(x, ..., deriv)` that for `deriv=0` returns psi(x)/x and for `deriv=1` returns psi'(x). Tuning constants will be passed in via `...`. `scale.est` method of scale estimation: re-scaled MAD of the residuals (default) or Huber's proposal 2 (which can be selected by either `"Huber"` or `"proposal 2"`). `k2` tuning constant used for Huber proposal 2 scale estimation. `maxit` the limit on the number of IWLS iterations. `acc` the accuracy for the stopping criterion. `test.vec` the stopping criterion is based on changes in this vector. `...` additional arguments to be passed to `rlm.default` or to the `psi` function. `lqs.control` An optional list of control values for `lqs`. `u` numeric vector of evaluation points. `k`, `a`, `b`, `c` tuning constants. `deriv` `0` or `1`: compute values of the psi function or of its first derivative.

### Details

Fitting is done by iterated re-weighted least squares (IWLS).

Psi functions are supplied for the Huber, Hampel and Tukey bisquare proposals as `psi.huber`, `psi.hampel` and `psi.bisquare`. Huber's corresponds to a convex optimization problem and gives a unique solution (up to collinearity). The other two will have multiple local minima, and a good starting point is desirable.

Selecting `method = "MM"` selects a specific set of options which ensures that the estimator has a high breakdown point. The initial set of coefficients and the final scale are selected by an S-estimator with `k0 = 1.548`; this gives (for `n \gg p`) breakdown point 0.5. The final estimator is an M-estimator with Tukey's biweight and fixed scale that will inherit this breakdown point provided `c > k0`; this is true for the default value of `c` that corresponds to 95% relative efficiency at the normal. Case weights are not supported for `method = "MM"`.

### Value

An object of class `"rlm"` inheriting from `"lm"`. Note that the `df.residual` component is deliberately set to `NA` to avoid inappropriate estimation of the residual scale from the residual mean square by `"lm"` methods.

The additional components not in an `lm` object are

 `s` the robust scale estimate used `w` the weights used in the IWLS process `psi` the psi function with parameters substituted `conv` the convergence criteria at each iteration `converged` did the IWLS converge? `wresid` a working residual, weighted for `"inv.var"` weights only.

### Note

Prior to version `7.3-52`, offset terms in `formula` were omitted from fitted and predicted values.

### References

P. J. Huber (1981) Robust Statistics. Wiley.

F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw and W. A. Stahel (1986) Robust Statistics: The Approach based on Influence Functions. Wiley.

A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth & Brooks/Cole.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

`lm`, `lqs`.
``````summary(rlm(stack.loss ~ ., stackloss))