# plot.lm: Plot Diagnostics for an lm Object

## Description

Six plots (selectable by `which`) are currently available: (1) a plot of residuals against fitted values, (2) a Normal Q-Q plot, (3) a Scale-Location plot of sqrt(|residuals|) against fitted values, (4) a plot of Cook's distances versus row labels, (5) a plot of residuals against leverages, and (6) a plot of Cook's distances against leverage/(1-leverage). By default, plots 1, 2, 3 and 5 are provided.

## Usage

```r
## S3 method for class 'lm'
plot(x, which = c(1,2,3,5),
     caption = list("Residuals vs Fitted", "Normal Q-Q",
       "Scale-Location", "Cook's distance",
       "Residuals vs Leverage",
       expression("Cook's dist vs Leverage  " * h[ii] / (1 - h[ii]))),
     panel = if(add.smooth) function(x, y, ...)
              panel.smooth(x, y, iter=iter.smooth, ...) else points,
     sub.caption = NULL, main = "",
     ask = prod(par("mfcol")) < length(which) && dev.interactive(),
     ...,
     id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75,
     qqline = TRUE, cook.levels = c(0.5, 1.0),
     add.smooth = getOption("add.smooth"),
     iter.smooth = if(isGlm) 0 else 3,
     label.pos = c(4,2),
     cex.caption = 1, cex.oma.main = 1.25)
```

## Arguments

| Argument | Description |
|---|---|
| `x` | `lm` object, typically result of `lm` or `glm`. |
| `which` | if a subset of the plots is required, specify a subset of the numbers `1:6`, see `caption` below (and the ‘Details’) for the different kinds. |
| `caption` | captions to appear above the plots; `character` vector or `list` of valid graphics annotations, see `as.graphicsAnnot`, of length 6, the j-th entry corresponding to `which[j]`. Can be set to `""` or `NA` to suppress all captions. |
| `panel` | panel function. The useful alternative to `points`, `panel.smooth` can be chosen by `add.smooth = TRUE`. |
| `sub.caption` | common title, placed above the figures if there are more than one; used as `sub` (see `title`) otherwise. If `NULL`, as by default, a possible abbreviated version of `deparse(x$call)` is used. |
| `main` | title to each plot, in addition to `caption`. |
| `ask` | logical; if `TRUE`, the user is asked before each plot, see `par(ask=.)`. |
| `...` | other parameters to be passed through to plotting functions. |
| `id.n` | number of points to be labelled in each plot, starting with the most extreme. |
| `labels.id` | vector of labels, from which the labels for extreme points will be chosen. `NULL` uses observation numbers. |
| `cex.id` | magnification of point labels. |
| `qqline` | logical indicating if a `qqline()` should be added to the normal Q-Q plot. |
| `cook.levels` | levels of Cook's distance at which to draw contours. |
| `add.smooth` | logical indicating if a smoother should be added to most plots; see also `panel` above. |
| `iter.smooth` | the number of robustness iterations, the argument `iter` in `panel.smooth()`; the default uses no such iterations for `glm` fits which is particularly desirable for the (predominant) case of binary observations, but also for other models where the response distribution can be highly skewed. |
| `label.pos` | positioning of labels, for the left half and right half of the graph respectively, for plots 1-3. |
| `cex.caption` | controls the size of `caption`. |
| `cex.oma.main` | controls the size of the `sub.caption` only if that is above the figures when there is more than one. |

## Details

`sub.caption`—by default the function call—is shown as a subtitle (under the x-axis title) on each plot when plots are on separate pages, or as a subtitle in the outer margin (if any) when there are multiple plots per page.

The ‘Scale-Location’ plot, also called ‘Spread-Location’ or ‘S-L’ plot, takes the square root of the absolute residuals in order to diminish skewness (sqrt(|E|) is much less skewed than | E | for Gaussian zero-mean E).
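The skewness claim can be checked numerically. The following is a minimal sketch on simulated data; the `skewness()` helper, the seed and the sample size are illustrative assumptions and are not part of `plot.lm`.

```r
# Sketch: compare the skewness of |E| and sqrt(|E|) for simulated Gaussian E.
set.seed(1)                      # arbitrary seed, for reproducibility only
E <- rnorm(1e5)                  # zero-mean Gaussian stand-in for residuals

# Hypothetical helper: moment-based sample skewness
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

skew_abs  <- skewness(abs(E))        # |E| is clearly right-skewed
skew_sqrt <- skewness(sqrt(abs(E)))  # sqrt(|E|) is much closer to symmetric
c(abs = skew_abs, sqrt_abs = skew_sqrt)
```

The second value should come out far closer to zero than the first, which is why the square root scale makes trends in spread easier to read.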

The ‘S-L’, the Q-Q, and the Residual-Leverage plots use standardized residuals, which have identical variance (under the hypothesis). They are given as R[i] / (s * sqrt(1 - h.ii)), where h.ii are the diagonal entries of the hat matrix, `influence()$hat` (see also `hat`), and where the Residual-Leverage plot uses standardized Pearson residuals (`residuals.glm(type = "pearson")`) for R[i].
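As a rough illustration for an ordinary `lm` fit, the formula can be evaluated by hand and compared with `rstandard()`. This sketch assumes the `LifeCycleSavings` model used in the examples; the variable names are chosen here for illustration.

```r
# Sketch: standardized residuals computed from the formula above.
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

R <- residuals(fit)              # raw residuals R[i]
h <- influence(fit)$hat          # diagonal entries h.ii of the hat matrix
s <- summary(fit)$sigma          # residual standard error s

r_manual <- R / (s * sqrt(1 - h))

# Compare with the built-in standardized residuals (names dropped)
all.equal(unname(r_manual), unname(rstandard(fit)))
```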

The Residual-Leverage plot shows contours of equal Cook's distance, for values of `cook.levels` (by default 0.5 and 1) and omits cases with leverage one with a warning. If the leverages are constant (as is typically the case in a balanced `aov` situation) the plot uses factor level combinations instead of the leverages for the x-axis. (The factor levels are ordered by mean fitted value.)
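To see where such a contour lies, one can use the standard identity for a linear model with p coefficients, D = r^2 * h / (p * (1 - h)), where r is the standardized residual and h the leverage. Solving for r gives the contour height at each leverage. The sketch below is an assumption-laden illustration, not the code of `plot.lm` itself; `cook_contour()` is a hypothetical helper.

```r
# Sketch: height of a Cook's distance contour in the residual-leverage plane.
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
p <- fit$rank                    # number of estimated coefficients
h <- hatvalues(fit)              # leverages
r <- rstandard(fit)              # standardized residuals

# Hypothetical helper: |r| at which Cook's distance equals `level`
cook_contour <- function(h, level, p) sqrt(level * p * (1 - h) / h)

# Contour heights for the default cook.levels on a grid of leverages
h_grid <- seq(0.05, 0.60, by = 0.05)
contours <- sapply(c(0.5, 1.0), function(lev) cook_contour(h_grid, lev, p))

# Observations outside the 0.5 contour are those with Cook's distance > 0.5
outside <- abs(r) > cook_contour(h, 0.5, p)
names(outside)[outside]
```

Points beyond the ±contour curves in the plot are the ones whose Cook's distance exceeds the corresponding level.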

In the Cook's distance vs leverage/(1-leverage) plot, contours of standardized residuals (`rstandard(.)`) that are equal in magnitude are lines through the origin. The contour lines are labelled with the magnitudes.
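The reason these contours are straight lines can be checked directly: for a linear model of rank p, Cook's distance divided by leverage/(1-leverage) equals r^2 / p, so all points sharing the same |r| lie on one line through the origin. A minimal sketch, assuming the `LifeCycleSavings` fit from the examples (variable names are illustrative):

```r
# Sketch: slope of each point's ray in the Cook's distance vs h/(1-h) plot.
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
p <- fit$rank                    # number of estimated coefficients
h <- hatvalues(fit)

x_axis <- h / (1 - h)            # horizontal coordinate in plot 6
D      <- cooks.distance(fit)    # vertical coordinate in plot 6

slope <- D / x_axis              # slope of the line from the origin to each point

# The slope depends only on the standardized residual: slope = r^2 / p
all.equal(unname(slope), unname(rstandard(fit)^2 / p))
```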

Notice that some plots may not make much sense for the `glm` case; e.g., the normal Q-Q plot only makes sense if the distribution is approximately normal.
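A `glm` fit reaches this method because its class vector includes `"lm"`. The sketch below uses simulated binary data (the data frame `d` and its columns are made up for illustration) and requests only plots 1 and 5, leaving out the normal Q-Q plot, which is hard to interpret for a binary response. Which plots are useful for a given family remains a judgment call.

```r
# Sketch: diagnostics for a binomial glm, skipping the normal Q-Q plot.
set.seed(42)                     # arbitrary seed, for reproducibility only
d <- data.frame(x = rnorm(200))  # simulated predictor (illustrative)
d$y <- rbinom(200, size = 1, prob = plogis(0.5 + d$x))

gfit <- glm(y ~ x, family = binomial, data = d)
class(gfit)                      # "glm" first, then "lm"

# Draw to a null PDF device so the sketch also runs non-interactively
pdf(NULL)
plot(gfit, which = c(1, 5))      # residuals vs fitted, residuals vs leverage
invisible(dev.off())
```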

## Author(s)

John Maindonald and Martin Maechler.

## References

Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics. New York: Wiley.

Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. London: Chapman and Hall.

Firth, D. (1991). Generalized Linear Models. In Hinkley, D. V., Reid, N. and Snell, E. J. (eds), Statistical Theory and Modelling: In Honour of Sir David Cox, FRS, pp. 55-82. London: Chapman and Hall.

Hinkley, D. V. (1975). On power transformations to symmetry. Biometrika, 62, 101–111. doi:10.2307/2334491.

McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. London: Chapman and Hall.

## See Also

`termplot`, `lm.influence`, `cooks.distance`, `hatvalues`.

## Examples

```r
require(graphics)

## Analysis of the life-cycle savings data
## given in Belsley, Kuh and Welsch.
lm.SR <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
plot(lm.SR)

## 4 plots on 1 page;
## allow room for printing model formula in outer margin:
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
plot(lm.SR)
plot(lm.SR, id.n = NULL)                 # no id's
plot(lm.SR, id.n = 5, labels.id = NULL)  # 5 id numbers

## Was default in R <= 2.1.x:
## Cook's distances instead of Residual-Leverage plot
plot(lm.SR, which = 1:4)

## Fit a smooth curve, where applicable:
plot(lm.SR, panel = panel.smooth)
## Gives a smoother curve
plot(lm.SR, panel = function(x, y) panel.smooth(x, y, span = 1))

par(mfrow = c(2,1)) # same oma as above
plot(lm.SR, which = 1:2, sub.caption = "Saving Rates, n=50, p=5")
```