plregr: Diagnostic Plots for Regr Objects

plregr    R Documentation

Diagnostic Plots for Regr Objects

Description

Diagnostic plots for fitted regression models: Residuals versus fit (Tukey-Anscombe plot) and/or target variable versus fit; Absolute residuals versus fit to assess equality of error variances; Normal Q-Q plot (for ordinary regression models); Residuals versus leverages to identify influential observations; Residuals versus sequence (if requested); and residuals versus explanatory variables. These plots are adjusted to the type of regression model.

Usage

plregr(x, data = NULL, plotselect = NULL, xvar = TRUE,
  transformed = NULL, sequence = FALSE, weights = NULL,
  addcomp = NULL, smooth = 2, smooth.legend = FALSE, markextremes = NA,
  plargs = NULL, ploptions = NULL, assign = TRUE, ...)

plresx(x, data = NULL, xvar = TRUE, transformed = NULL,
  sequence = FALSE, weights = NULL,
  addcomp = NULL, smooth = 2, smooth.legend = FALSE, markextremes = NA,
  plargs = NULL, ploptions = NULL, assign = TRUE, ...)

Arguments

x

"regr" (or also lm or glm) object, result of a call to regr() from package regr. This is the only argument needed. All others have useful defaults.

data

data set where explanatory variables and the following possible arguments are found: weights, plweights, pch, plabs

plotselect

which plots should be shown? See Details.

xvar

if TRUE, residuals will be plotted versus all explanatory variables (or terms, according to argument 'transformed') in the model (plregr will call plresx).
If it is a character vector, it contains the variables to be used.
If it is a formula, its right hand side contains these variables. The model formula is updated by such a formula. Hence, a formula of the form ~ . + <variables> adds variables to those in the model.
If any of the variables are not contained in the model, the argument data is needed.

transformed

logical: should residuals be shown against transformed explanatory variables? If TRUE, the variables are transformed as implied by the model.

sequence

if TRUE, residuals will be plotted versus the sequence as they appear in the data. If another explanatory variable is monotone increasing or decreasing, the plot is not shown, but a warning is given.

weights

if TRUE, residuals will be plotted versus x$weights. Alternatively, a vector of weights can be specified.

addcomp

logical: should component effects be added to residuals for residuals versus input variables plots?

smooth

logical: should a smooth line be added?

smooth.legend

When a grouping factor is used (argument smooth.group, see below), this argument determines whether and where the legend for identifying the groups should be shown, see Details

markextremes

proportion of extreme residuals to be labeled. To label all points, set markextremes=1.

plargs

result of calling pl.control. If NULL, pl.control will be called to generate it. If not NULL, arguments given in ... will be ignored.

ploptions

list of pl options.

assign

logical: Should the plargs be stored in the pl.envir environment?

...

Many further arguments are available to customize the plots, see below for some of the most useful ones, and plregr.control for a complete list.
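
The formula form of xvar described above is said to update the model formula. As a minimal sketch of what that means, the following uses base R's update() on a small formula. That plregr processes xvar in exactly this way internally is an assumption; the variable names come from the LifeCycleSavings data used in the Examples.

```r
# Base R only: how a right-hand-side formula updates a model formula.
model_formula <- sr ~ pop15 + pop75

# "~ . + dpi" keeps the existing terms (the dot) and adds dpi
added <- update(model_formula, ~ . + dpi)

# "~ dpi" without the dot replaces the right hand side
replaced <- update(model_formula, ~ dpi)

print(added)
print(replaced)
```

In this sketch, dpi is not part of the model formula, which corresponds to the case where the data argument is needed.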

Details

Argument plotselect determines which plots will be shown. It should be a named vector of numbers with the following meanings:

0

do not show

1

show without smooth

2

show with smooth (not for qq nor leverage)

3

show with smooth and smooth band (only for resfit in plregr and in plresx)

The default is c( yfit=0, resfit=smdef, absresfit = NA, absresweights = NA, qq = NA, leverage = 2, resmatrix = 1, qqmult = 3), where smdef is 3 (actually argument smooth of plregr.control plus 1) for normal random deviations and one less (no band) for others.

Modify this vector to change the selection and the sequence in which the plots appear. Alternatively, provide a named vector listing only the plots that should be shown at a level different from the default, like plotselect = c(resfit = 2, leverage = 1). Adding an element default = 0 suppresses all plots not mentioned. This is useful to select single plots, like plotselect = c(resfit = 3, default = 0).

The names of plotselect refer to:

yfit

response versus fitted values

resfit

residuals versus fitted values (Tukey-Anscombe plot)

absresfit

absolute residuals versus fitted values, defaults to TRUE for ordinary regression, FALSE for glm and others

absresweights

absolute residuals versus weights

qq

normal Q-Q plot, defaults to TRUE for ordinary regression, FALSE for glm and others

leverage

residuals versus leverage (hat diagonal)

resmatrix

scatterplot matrix of residuals for multivariate regression

qqmult

Q-Q plot of Mahalanobis lengths versus square roots of chi-squared quantiles.
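
The two ways of specifying plotselect described above can be sketched as follows. The vectors are taken from the text; the call itself is guarded so that it only runs when the plgraphics package is installed, and xvar = FALSE is used here merely to keep the output to the selected plot.

```r
# Levels: 0 = do not show, 1 = without smooth, 2 = with smooth,
#         3 = with smooth and smooth band
sel_partial <- c(resfit = 2, leverage = 1)  # change two plots, keep other defaults
sel_single  <- c(resfit = 3, default = 0)   # show only the Tukey-Anscombe plot

fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

# Only attempt the plot if plgraphics is available
if (require("plgraphics", quietly = TRUE)) {
  plregr(fit, plotselect = sel_single, xvar = FALSE)
}
```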

In the 'resfit' (Tukey-Anscombe) plot, the reference line indicates a "contour" line with constant values of the response variable, Y = ŷ + r = constant, where ŷ is the fitted value and r the residual. Since r = constant - ŷ along such a line, it has slope -1. It is useful to judge whether any curvature shown by the smooth might disappear after a nonlinear, monotone transformation of the response.
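
To make the slope of -1 concrete, here is a small base R sketch that draws such a contour line by hand on a residuals versus fit plot. It does not use plregr, and the choice of the median response as the constant is purely illustrative.

```r
fit  <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
yhat <- fitted(fit)
res  <- resid(fit)

# Response = fitted + residual, so all points with the same response
# value y_const satisfy res = y_const - yhat: intercept y_const, slope -1.
y_const <- median(LifeCycleSavings$sr)

plot(yhat, res, xlab = "fitted values", ylab = "residuals")
abline(h = 0, lty = 3)                # zero line
abline(a = y_const, b = -1, lty = 2)  # contour line Y = y_const
```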

If smresid is true, the 'absresfit' plot uses modified residuals: differences between the ordinary residuals and the smooth appearing in the 'resfit' plot. Analogously, the 'qq' plot is then based on yet another modification of these modified residuals: they are scaled by the smoothed scale shown in the 'absresfit' plot, after these scales have been standardized to have a median of 0.674 (=qnorm(0.75)).
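
The two modifications described above can be imitated in base R. This is only a sketch of the idea under stated assumptions: it calls loess directly with default settings, whereas plregr uses its own smoothing function and options, so the numbers will not match what the package computes. The names r_mod, scale_std and r_scaled are made up for this illustration.

```r
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
f <- fitted(fit)
r <- resid(fit)

# Step 1: modified residuals = ordinary residuals minus their smooth
r_mod <- r - fitted(loess(r ~ f))

# Step 2: smooth the absolute modified residuals to estimate a local scale
scale_sm <- fitted(loess(abs(r_mod) ~ f))

# Step 3: standardize the scales so that their median is qnorm(0.75)
scale_std <- scale_sm / median(scale_sm) * qnorm(0.75)

# Step 4: scaled residuals that a Q-Q plot could be based on
r_scaled <- r_mod / scale_std
qqnorm(r_scaled)
```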

The smoothing function used by default is smoothRegr, which calls loess. This can be changed by setting ploptions(smooth.function=<func>), which must have the same arguments as smoothRegr.

The arguments lty, lwd, colors characterize how the graphical elements in the plot are shown. They should be three vectors of length 9 each, defining the line types, line widths, and colors to be used for ...

[1]

observations;

[2]

reference lines;

[3]

smooth;

[4]

simulated smooths;

[5]

component effects in plresx;

[6]

confidence bands of component effects.

In the case of glm.restype="cond.quant"

[7]

(random) observations;

[8]

conditional medians;

[9]

bars showing conditional quantiles.

If smooths are shown by group (given in smooth.group), a legend can be requested and positioned in the respective plots by using the argument smooth.legend. If it is TRUE, the legend will be placed in the "bottomright" corner. Alternatively, the corner can be specified as "bottomright", "bottomleft", "topleft", or "topright". A coordinate pair may also be given. These possibilities can be used individually for each plot by giving a named vector or a named list, where the names are one of "yfit", "resfit", "absresfit", "absresweight", ".xvar." or names of x variables provided by the xvar argument. A component ".xvar." selects the first x variable.

There is a hidden argument innerrange.fit that allows for fixing an inner range for plotting the fitted values.

Value

The list of the evaluations of all arguments and some more useful items is returned invisibly.

Note

This is a function under development. Future versions may behave differently and may not be compatible with this version.

Author(s)

Werner A. Stahel, ETH Zurich

See Also

plregr.control, plot.lm

Examples

library(plgraphics)  ## provides plregr and the d.* example data sets
data(LifeCycleSavings, package="datasets")
r.savings <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
plregr(r.savings)

## --- *transformed* linear model
data(d.blast)
r.blast <-
     lm(log10(tremor) ~ location+log10(distance)+log10(charge),
          data=d.blast)
plregr(r.blast, sequence=TRUE, transformed=TRUE)
plregr(r.blast, xvar=FALSE, innerrange.fit=c(0.3,1.2))


## --- multivariate regression
data(d.fossileSamples)
r.foss <-
  lm(cbind(sAngle,lLength,rWidth) ~ SST+Salinity+lChlorophyll+Region+N,
  data=d.fossileSamples)
plregr(r.foss, plotselect=c(resfit=3, resmatrix=1, qqmult=1))


## --- logistic regression
data(d.babysurvival)
rr <- glm(Survival ~ Weight+Age+Apgar1, data=d.babysurvival, family=binomial)
plregr(rr, xvar= ~Weight, cex.plab=0.7, ylim=c(-5,5))
plregr(rr, condquant=FALSE)

## --- ordinal regression
if(requireNamespace("MASS")) {
data(housing, package="MASS")
rr <- MASS::polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
plregr(rr, factor.show="jitter")
}

plgraphics documentation built on Oct. 19, 2023, 3 p.m.