regspmplot: Interactive scatterplot matrix for regression

regspmplot R Documentation

Description

Produces an interactive scatterplot of the response y against each variable of the predictor matrix X.

Usage

regspmplot(
  y,
  X,
  group,
  plot,
  namey,
  nameX,
  col,
  cex,
  pch,
  labeladd,
  legend,
  xlim,
  ylim,
  tag,
  datatooltip,
  databrush,
  subsize,
  selstep,
  selunit,
  trace = FALSE,
  ...
)

Arguments

y

response variable, or an object containing the response, the predictors and possibly other variables resulting from the monitoring of a regression.

If y is a vector, a data matrix X must also be supplied as an argument. If y is a list containing just y and X, the call is equivalent to regspmplot(y, X). Otherwise y must be an object of S3 class fsreda.object returned by fsreg with monitoring=TRUE, that is, a list containing the quantities monitored along a search.

X

Predictor variables. Data matrix of explanatory variables (also called 'regressors') of dimension n by p if the argument y is a vector. The rows of X represent observations, and the columns represent variables.

group

grouping variable. Vector with n elements. Specifies a grouping variable defined as a categorical variable (factor), numeric, or array of strings, or string matrix, and it must have the same number of rows as X. This grouping variable determines the marker and color assigned to each point. Remark: if group is used to distinguish a set of outliers from a set of good units, the id number for the outliers should be the larger one (see optional field labeladd of parameter plot for details).
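The remark about outlier ids can be sketched in base R as follows. The data and the contaminated units are purely illustrative assumptions; the final call is left commented out because it needs fsdaR installed and configured.

```r
set.seed(1)
n <- 100
p <- 3
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rnorm(n)

# Hypothetical contamination: shift the last 10 responses upwards
outliers <- (n - 9):n
y[outliers] <- y[outliers] + 5

# Good units get id 1, outliers get the larger id 2
group <- rep(1, n)
group[outliers] <- 2

# group must have one entry per row of X
stopifnot(length(group) == nrow(X))
table(group)

# regspmplot(y, X, group, labeladd = TRUE)
```

Because the outliers carry the largest id, they form the last group, which is the group that labeladd refers to.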

plot

This option controls the names which are displayed in the margins of the scatterplot matrix as well as the labels of the legend. If plot=FALSE (default), then namey, nameX and labeladd are all set to the empty string, and no label and no name is added to the plot. If plot=TRUE, the names y and X1, ..., Xp are added to the margins of the scatterplot matrix. If plot is a list, it is possible to control not only the names but also the point labels, colors and symbols. More precisely, the list plot may contain the following elements:

  1. labeladd - see parameter labeladd

  2. namey - a character string containing the response variable name. See parameter namey.

  3. nameX - a vector of character strings containing the labels of the explanatory variables. By default, the labels added are X1, ..., Xp. See parameter nameX.

  4. clr - see parameter col

  5. sym - see parameter pch

  6. siz - see parameter cex

  7. doleg - see parameter legend

  8. xlimx - see parameter xlim

  9. ylimy - see parameter ylim
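A minimal sketch of such a list, using only the element names listed above. The values are illustrative assumptions (clr is omitted because the accepted color format is not stated here), and the regspmplot call is left commented out because it needs fsdaR installed and configured.

```r
p <- 3

# Hypothetical settings; element names follow the list above
plot_opts <- list(
  labeladd = TRUE,                  # label the units of the last group
  namey    = "y",                   # name of the response
  nameX    = paste0("X", seq_len(p)), # names of the explanatory variables
  sym      = c(1, 3),               # symbols, one per group (see pch)
  siz      = 1,                     # symbol size (see cex)
  doleg    = TRUE,                  # show a legend
  xlimx    = c(-3, 3),              # x axis limits
  ylimy    = c(-3, 3)               # y axis limits
)

# Guard against misspelled element names
documented <- c("labeladd", "namey", "nameX", "clr", "sym",
                "siz", "doleg", "xlimx", "ylimy")
stopifnot(all(names(plot_opts) %in% documented))

# regspmplot(y, X, group, plot = plot_opts)
```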

namey

a character string with the name of the response variable

nameX

a vector of character strings with the names of the explanatory variables

col

color specification for the data points. Can be different for each group. By default, the order of the colors is blue, red, black, magenta, green, cyan and yellow.

cex

the size of the symbols used for plotting. By default (cex=1) the symbol size depends on the number of plots and the size of the figure window. Values larger than 1 will increase the size and values smaller than 1 will decrease the size.

pch

specification of the symbols to use. For example, if there are three groups and pch=c(1, 3, 4), the first group will be plotted with a circle, the second with a plus, and the third with an 'x' (see ?pch or ?points for a list of symbols). NOTE: not all symbols available in R can be mapped to the symbols in MATLAB.
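The group-to-symbol mapping in the example above can be previewed with base R graphics. This is only an analogy with made-up data: it uses plot() and legend() from base R, not regspmplot itself, so the rendering of the symbols may differ.

```r
# Three groups coded 1, 2, 3 (illustrative data, not from fsdaR)
group <- c(1, 1, 2, 2, 3, 3)
x <- c(0.2, 0.5, 1.1, 1.4, 2.0, 2.3)
y <- c(1.0, 1.2, 0.4, 0.6, 1.8, 2.1)

pch <- c(1, 3, 4)           # circle, plus, 'x'
point_symbol <- pch[group]  # one symbol per observation

plot(x, y, pch = point_symbol)
legend("topleft", legend = paste("group", 1:3), pch = pch)
```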

labeladd

logical, controls whether the elements belonging to the last group in the scatterplot matrix are labelled with their unit row index or their rowname. The rowname is taken from the parameter label or, if it is missing, from the sequence 1:n. The default value is labeladd=FALSE, i.e. no label is added.

legend

logical, controls whether a legend is shown or not.

xlim

x limits. A vector with two elements controlling the minimum and maximum on the x axis. By default, an automatic scale is used.

ylim

y limits. A vector with two elements controlling the minimum and maximum on the y axis. By default, an automatic scale is used.

tag

Plot handle. String which identifies the handle of the plot which is about to be created. The default is tag='pl_mmd'. Note that if the program finds a plot whose tag is equal to the one specified by the user, the new plot overwrites the existing one in the same window; otherwise a new window is created.

datatooltip

If datatooltip is not empty the user can use the mouse in order to have information about the unit selected, the step in which the unit enters the search and the associated label. If datatooltip is a list, it is possible to control the aspect of the data cursor (see MATLAB function datacursormode() for more details or see the examples below). The default options are DisplayStyle="Window" and SnapToDataVertex="on".

databrush

Interactive mouse brushing. If databrush is missing or empty (default), no brushing is done. The activation of this option (databrush is TRUE or a list) enables the user to select a set of trajectories in the current plot and to see them highlighted in the scatterplot matrix. If the scatterplot matrix does not exist it is automatically created. In addition, brushed units can be highlighted in the monitoring MD plot. Note that the window style of the other figures is set equal to that which contains the monitoring residual plot. In other words, if the monitoring residual plot is docked all the other figures will be docked too.

If databrush=TRUE the default selection tool is a rectangular brush and it is possible to brush only once (that is persist="").

If databrush=list(...), it is possible to use all optional arguments of the MATLAB function selectdataFS() and the following optional arguments:

  • persist: This option can be an empty value or a character string containing 'on' or 'off'. The default value is persist="", that is, brushing is allowed only once. If persist="on" or persist="off", brushing can be done as many times as the user requires. If persist='on', the unit(s) currently brushed are added to those previously brushed. It is possible, every time a new brushing is done, to use a different color for the brushed units. If persist='off', every time a new brush is performed, units previously brushed are removed.

  • labeladd: add labels of brushed units in the scatterplot matrix. If this option is '1', we label the units of the last selected group with the unit row index in the matrix X. The default value is labeladd="", i.e. no label is added.
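The three brushing modes described above could be written as follows. This is only a sketch of the argument values; the variable names are hypothetical, and the call is left commented out because it needs fsdaR installed and an object out returned by fsreg with monitoring=TRUE.

```r
# Single rectangular brush (equivalent to persist = "")
brush_once <- TRUE

# Repeated brushing, each new selection added to the previous ones,
# with the brushed units labelled by their row index
brush_add <- list(persist = "on", labeladd = "1")

# Repeated brushing, each new selection replacing the previous one
brush_replace <- list(persist = "off")

# persist accepts only these three values according to the text above
stopifnot(brush_add$persist %in% c("", "on", "off"))
stopifnot(brush_replace$persist %in% c("", "on", "off"))

# regspmplot(out, databrush = brush_add)
```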

subsize

x axis control, a numeric vector containing the subset size with length equal to the number of columns of matrix residuals. If it is not specified it will be set equal to (nrow(residuals) - ncol(residuals) + 1) : nrow(residuals).
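As a quick check of the default formula, here is a sketch with a simulated residuals matrix. It assumes one row per unit and one column per step of the search, and the values of n and m0 are arbitrary.

```r
set.seed(1)
n  <- 100  # number of units
m0 <- 10   # hypothetical initial subset size

# One row per unit, one column per step from m0 to n
residuals <- matrix(rnorm(n * (n - m0 + 1)), nrow = n, ncol = n - m0 + 1)

# Default subsize as given in the argument description
subsize <- (nrow(residuals) - ncol(residuals) + 1):nrow(residuals)

# Its length matches the number of columns of residuals
stopifnot(length(subsize) == ncol(residuals))
range(subsize)
```

With these dimensions the default runs from m0 to n, one value per monitored step.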

selstep

Text shown in selected steps. A numeric vector which specifies the steps of the forward search for which text labels are added in the monitoring residual plot after a brushing action in the yXplot. The default is selstep=c(m0, n), where m0 and n are respectively the first and final step of the search, so that labels are written at the initial and final step.

selunit

Unit labelling. A vector of strings, a string, or a numeric vector for labelling units. If y is an object, the threshold is associated with the trajectories of the residuals monitored along the search; otherwise it refers to the values of the response variable. If it is a vector of strings, only the lines associated with the units that in at least one step of the search had a residual smaller than selunit[1] or greater than selunit[2] will have a textbox. If it is a string, it specifies the threshold above which labels are added. For example, selunit='2.6' means that text labels are written only for the units which have, in at least one step of the search, a scaled residual greater than 2.6 in absolute value. If it is a numeric vector, it contains the list of units for which text labels are added. The default value of selunit is the string '2.5' if y is an object; otherwise it is an empty value.
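The two threshold rules can be illustrated in base R on a simulated matrix of scaled residuals. This is a sketch of the selection logic as described above, not the code used internally; the matrix res and the planted value are assumptions.

```r
set.seed(42)
n     <- 20  # units
steps <- 15  # steps of the search

# Hypothetical scaled residuals: one row per unit, one column per step
res <- matrix(rnorm(n * steps), nrow = n, ncol = steps)
res[3, 5] <- 4.1  # plant one clearly large residual for unit 3

# selunit given as a single string: threshold on the absolute value
selunit <- "2.6"
threshold <- as.numeric(selunit)
labelled <- which(apply(abs(res), 1, max) > threshold)

# selunit given as a vector of two strings: lower and upper bounds
selunit2 <- c("-3", "3")
bounds <- as.numeric(selunit2)
outside <- which(apply(res, 1, function(r) any(r < bounds[1] | r > bounds[2])))

labelled
outside
```

In both cases a unit is selected if the condition holds in at least one step, which is why the maximum (or any()) is taken across each row.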

trace

Whether to print intermediate results. Default is trace=FALSE.

...

potential further arguments passed to lower level functions.

Value

none

Author(s)

FSDA team, valentin.todorov@chello.at

See Also

spmplot, mdrplot, resfwdplot

Examples


 ## Not run: 
 ##  Example of the use of function regspmplot with all the default options
 ##  regspmplot() with first argument vector y and no option.
 ##  In the first example as input there are two matrices: y and X respectively
 ##  A simple plot is created

 n <- 100
 p <- 3
 X <- matrix(data=rnorm(n*p), nrow=n, ncol=p)
 y <- matrix(data=rnorm(n*1), nrow=n, ncol=1)
 regspmplot(y, X)

 ##  Example of the use of function regspmplot with first argument
 ##  vector y and third argument group.
 ##  Different groups are shown in the yXplot

 group <- rep(0, n)
 group[1:(n/2)] <- rep(1, n/2)
 regspmplot(y, X, group)

 ##  Example of the use of function regspmplot with first argument
 ##  vector y, third argument group and fourth argument plot
 ##  (Ex1) plot=TRUE

 regspmplot(y, X, group, plot=TRUE)

 ##  (Ex2) Set the scale for the x axes, the y axis and control symbol type
 regspmplot(y, X, group, xlim=c(-1,2), ylim=c(0,2), pch=c(10,11), trace=TRUE)

 ##  When the first input argument is an object.
 ##  In the following example the input is an object which also contains
 ##  information about the forward search.
     (out <- fsreg(y~X, method="LMS", control=LXS_control(nsamp=1000)))
     (out <- fsreg(y~X, bsb=out$bs, monitoring=TRUE))

     regspmplot(out, plot=0)

 
## End(Not run)


fsdaR documentation built on March 31, 2023, 8:18 p.m.