resindexplot: Plots the residuals from a regression analysis versus index...

resindexplot R Documentation

Plots the residuals from a regression analysis versus index number or any other variable

Description

The function resindexplot() plots the residuals from a regression analysis versus index number or any other variable. The residuals come from an output object of any of the regression functions or simply from a vector of values. To use the databrush option, the residuals must come from one of the fsdaR regression functions.

Usage

resindexplot(out, x, xlim, ylim, xlab, ylab, main, numlab, indlab, conflev, cex.axis, 
    cex.lab, lwd, nameX, namey, tag, col, cex, databrush, ...)

Arguments

out

A vector containing the residuals from a regression analysis or an object returned by one of the regression functions (see FSR_control, LXS_control, Sreg_control and MMreg_control). The object is one of fsr.object, fsdalts.object, fsdalms.object, sreg.object or mmreg.object. At a minimum, out must contain the element residuals; if the option databrush is used, X and y are also needed.
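Because out may be a plain numeric vector, residuals from a fit made outside fsdaR can be plotted as well. The following is a minimal sketch, assuming an ordinary lm() fit on the built-in stackloss data and standardized residuals (both are illustrative choices, not something this page prescribes). The plotting call is guarded so the snippet still runs where fsdaR is not installed; with vector input, brushing is not available.

```r
# Residuals from an ordinary least squares fit (base R, not an fsdaR function)
fit <- lm(stack.loss ~ ., data = stackloss)

# Standardized residuals as a plain numeric vector (illustrative choice)
res <- as.numeric(rstandard(fit))

# Plot only in an interactive session where fsdaR is available
if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  fsdaR::resindexplot(res)
}
```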

x

The vector to be plotted on the x-axis. By default the sequence 1:length(residuals) is used.

xlim

Control x scale in plot. Vector with two elements controlling minimum and maximum on the x axis. Default is to use automatic scale.

ylim

Control y scale in plot. Vector with two elements controlling minimum and maximum on the y axis. Default is to use automatic scale.

xlab

a title for the x axis

ylab

a title for the y axis

main

an overall title for the plot

numlab

Number of points to be identified in plots (see also indlab). If numlab is a single number k, the units with the k largest residuals are labelled in the plots. If numlab is a vector, the units with the indexes in numlab are labelled in the plots. The default is numlab=5, so the units with the 5 largest residuals are labelled. If numlab=0 or numlab=NULL no labelling is done.
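The scalar form of numlab can be pictured with a few lines of base R. This is only a sketch on made-up residuals, and it assumes that "largest" means largest in absolute value, which this page does not state explicitly.

```r
# Made-up residuals for illustration only
res <- c(0.3, -2.9, 1.1, 0.2, 4.0, -0.5, 2.2)

# Indexes of the k units with the largest absolute residuals
# (assumption: "largest" is taken in absolute value)
k <- 3
top_k <- order(abs(res), decreasing = TRUE)[seq_len(k)]

# Scalar and vector forms of numlab, guarded so the snippet runs without fsdaR
if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  fsdaR::resindexplot(res, numlab = k)        # label the k largest
  fsdaR::resindexplot(res, numlab = c(2, 5))  # label units 2 and 5
}
```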

indlab

Which points to be identified in plots (see also numlab) - the units with indexes in the vector indlab are labelled in the plots.

conflev

Confidence level(s) for the horizontal bands: a single number or a numeric vector of several confidence levels.

Remark: the confidence bands are based on the chi^2 distribution.
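To get a feel for where chi^2-based bands fall, here is a hedged sketch. It assumes scaled residuals and bands placed at plus or minus the square root of the chi^2 quantile with 1 degree of freedom; the exact computation used by resindexplot() is not documented on this page. Under that assumption the bands coincide with the usual two-sided normal quantiles.

```r
conflev <- c(0.95, 0.99)

# Assumed band half-widths: square root of the chi^2 quantile with 1 df
band <- sqrt(qchisq(conflev, df = 1))

# Equivalent two-sided standard normal quantiles
band_normal <- qnorm(1 - (1 - conflev) / 2)

print(round(cbind(conflev, lower = -band, upper = band), 3))
```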

cex.axis

The magnification to be used for axis annotation relative to the current setting of cex

cex.lab

The magnification to be used for x and y labels relative to the current setting of cex

lwd

The line width, a positive number, defaulting to 1

tag

Figure tag (character). Tag of the figure which will host the resindexplot. The default tag is pl_resindex.

col

Fill color for markers that are closed shapes (circle, square, diamond, pentagram, hexagram, and the four triangles). Can be 'none', 'auto', a color name (string) or an RGB triplet.

cex

Size of the point symbols. The magnification to be used relative to the current setting of cex.

nameX

Add variable labels in plot. A vector of strings of length p containing the labels of the variables of the regression dataset. If it is empty (default) the sequence X1, ..., Xp will be created automatically

namey

Add response label. A string containing the label of the response

databrush

Interactive mouse brushing. If databrush is missing or empty (default) or databrush=FALSE, no brushing is done. The activation of this option (databrush is a scalar or a list) enables the user to select a set of trajectories in the current plot and to see them highlighted in the y|X plot, i.e. a matrix of scatter plots of y against each column of X, grouped according to the selection(s) done by brushing. If the plot y|X does not exist it is automatically created. In addition, brushed units are automatically highlighted in the minimum deletion residual plot if it is already open. The extension to the following plots will be available in future versions of the package:

  1. monitoring leverage plot;

  2. maximum studentized residual;

  3. s^2 and R^2;

  4. Cook distance and modified Cook distance;

  5. deletion t statistics.

Note that the window style of the other figures is set equal to that of the figure containing the monitoring residual plot. In other words, if the monitoring residual plot is docked, all the other figures will be docked too.

If databrush=TRUE the default selection tool is a rectangular brush and it is possible to brush only once (that is, persist="").

If databrush=list(...), it is possible to use all optional arguments of function selectdataFS() and the following optional argument:

  1. persist: Either an empty value or a character string containing 'on' or 'off'. The default value is persist="", that is, brushing is allowed only once. If persist="on" or persist="off", brushing can be done as many times as the user requires. If persist='on', the unit(s) currently brushed are added to those previously brushed, and a different color can be used for the brushed units each time a new brushing is done. If persist='off', units previously brushed are removed every time a new brushing is performed.

  2. bivarfit: This option adds one or more least squares lines based on SIMPLE REGRESSION to the plots of y|X, depending on the selected groups. The default is bivarfit=FALSE: no line is fitted. If bivarfit=1, a single OLS line is fitted to all points of each bivariate plot in the scatter matrix y|X. If bivarfit=2, two OLS lines are fitted: one to all points and another to the group of the genuine observations. The group of the potential outliers is not fitted. If bivarfit=0, one OLS line is fitted to each group. This is useful for fitting mixtures of regression lines. If bivarfit='i1' or bivarfit='i2', etc., an OLS line is fitted to a specific group, the one with index 'i' equal to 1, 2, 3 etc. Again, this is useful in the case of mixtures.

  3. multivarfit: Whether to superimpose multivariate least squares lines. This option adds one or more least squares lines, based on MULTIVARIATE REGRESSION of y on X, to the plots of y|Xi. The default is multivarfit=FALSE: no line is fitted. If multivarfit=1, a line based on the multivariate regression is added to each scatter plot of y against Xi in the scatter matrix y|X. The line added to the scatter plot y|Xi is avconst + Ci*Xi, where Ci is the coefficient of Xi in the multivariate regression and avconst is the effect of all the other explanatory variables different from Xi evaluated at their centroid (that is, overline(y)'C). If multivarfit=2, the action is the same as with multivarfit=1, but the line based on the group of unselected observations (i.e. the normal units) is also added.

  4. labeladd: Add outlier labels in plot. If labeladd=TRUE, we label the outliers with the unit row index in matrices X and y. The default value is labeladd=FALSE, i.e. no label is added.
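The list form of databrush can be assembled as an ordinary R list. The sketch below uses only the option names described above, with values chosen purely for illustration, and guards the plotting call so the snippet still runs where fsdaR is not installed or the session is not interactive.

```r
# Brushing options taken from the list above (values are illustrative)
brush_opts <- list(
  persist  = "on",  # keep adding newly brushed units to earlier selections
  bivarfit = 2,     # OLS lines for all points and for the genuine observations
  labeladd = TRUE   # label brushed units with their row index
)

# Brushing needs an fsdaR regression object, not a bare residual vector
if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  out <- fsdaR::fsreg(stack.loss ~ ., data = stackloss)
  fsdaR::resindexplot(out, databrush = brush_opts)
}
```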

...

potential further arguments passed to lower level functions.

Details

No details

Value

No value returned

Author(s)

FSDA team

Examples

## Not run: 
library(fsdaR)
# Fit a robust regression to the built-in stackloss data, then plot the residuals
out <- fsreg(stack.loss~., data=stackloss)
resindexplot(out, conflev=c(0.95,0.99), col="green")

## End(Not run)

fsdaR documentation built on March 31, 2023, 8:18 p.m.