spmplot: Interactive scatterplot matrix

View source: R/spmplot.R

spmplot R Documentation

Interactive scatterplot matrix

Description

Produces an interactive scatterplot matrix with boxplots or histograms on the main diagonal and, optionally, robust bivariate contours.

Usage

spmplot(
  X,
  group,
  plot,
  variables,
  col,
  cex,
  pch,
  labeladd,
  label,
  legend,
  dispopt = c("hist", "box"),
  tag,
  datatooltip,
  databrush,
  trace = FALSE,
  ...
)

Arguments

X

data matrix (2D array) containing n observations on p variables or an object of S3 class fsmeda.object returned by fsmult with monitoring=TRUE - a list containing the monitoring of minimum Mahalanobis distance

group

grouping variable. Vector with n elements. Specifies a grouping variable defined as a categorical variable (factor), numeric, or array of strings, or string matrix, and it must have the same number of rows as X. This grouping variable determines the marker and color assigned to each point. Remark: if group is used to distinguish a set of outliers from a set of good units, the id number for the outliers should be the larger (see optional field labeladd of parameter plot for details).
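A minimal sketch of the remark above, using simulated data and a hypothetical vector `outlier_idx` of suspected outlier rows (both are assumptions made for illustration). The good units are coded 0 and the outliers 1, so the outliers carry the larger id and form the last group.

```r
# Hypothetical data: 100 observations on 3 variables
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)

# Assumed row indices of the suspected outliers (illustrative only)
outlier_idx <- c(3, 17, 42)

# Good units get id 0, outliers get the larger id 1
group <- rep(0, n)
group[outlier_idx] <- 1

# One element per row of X, as required for 'group'
table(group)
```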

plot

controls the names which are displayed in the margins of the scatterplot matrix, the labels of the legend, the colors and the symbols. If plot is empty (plot=FALSE or plot=0 or plot=c() or plot=NULL), empty strings are displayed and no label and no name is added to the plot. If plot=TRUE or plot=1, the names Y1, ..., Yp are added to the margins of the scatterplot matrix; otherwise nothing is added. If plot is a list, it is possible to control not only the names but also point labels, colors and symbols. More precisely, the list plot may contain the following elements:

  1. labeladd - see parameter labeladd

  2. nameY - a character string containing the labels of the variables. As default value, the labels which are added are Y1, ..., Yp. See parameter variables.

  3. clr - see parameter col

  4. sym - see parameter pch

  5. siz - see parameter cex

  6. doleg - see parameter legend

  7. label - see parameter label
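As a rough sketch, a plot list could be assembled as follows. The element names come from the list above; the value types are an assumption that each element accepts the same kind of value as the parameter it refers to. The call to spmplot() is guarded so that it only runs in an interactive session with fsdaR installed.

```r
X <- iris[, 1:4]
group <- iris[, 5]

# Element names are from the documentation; the values are assumptions
plot_opts <- list(
  nameY    = c("SL", "SW", "PL", "PW"),     # labels of the variables
  label    = as.character(seq_len(nrow(X))), # one label per unit
  labeladd = TRUE                            # label units of the last group
)

if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  fsdaR::spmplot(X, group, plot = plot_opts)
}
```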

variables

a character string with the names of the variables

col

color specification for the data points. Can be different for each group. By default, the order of the colors is blue, red, black, magenta, green, cyan and yellow.

cex

the size of the symbols used for plotting. With the default cex=1, the symbol size depends on the number of plots and the size of the figure window. Values larger than 1 increase the size and values smaller than 1 decrease it.

pch

specification of the symbols to use. For example, if there are three groups and pch=c(1, 3, 4), the first group will be plotted with a circle, the second with a plus, and the third with an 'x' (see ?pch or ?points for a list of symbols). NOTE: not all symbols available in R can be mapped to the symbols in MATLAB.
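The following sketch shows one way to prepare per-group values for col and pch, with one entry per group in the order of the group levels. The variable names `group_col` and `group_pch` are illustrative, and the value cex=1.2 is an arbitrary choice. The plotting call is guarded so that it only runs in an interactive session with fsdaR installed.

```r
X <- iris[, 1:4]
group <- iris[, 5]
k <- nlevels(group)   # number of groups (3 for iris)

# One color and one symbol per group, in the order of levels(group)
group_col <- c("blue", "red", "black")[seq_len(k)]
group_pch <- c(1, 3, 4)[seq_len(k)]   # circle, plus, 'x'

# Overview of which color and symbol each group is meant to get
data.frame(group = levels(group), col = group_col, pch = group_pch)

if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  fsdaR::spmplot(X, group, col = group_col, pch = group_pch, cex = 1.2)
}
```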

labeladd

logical, controls whether the elements belonging to the last group in the scatterplot matrix are labelled with their unit row index or their row name. The row name is taken from the parameter label or, if it is missing, from the sequence 1:n. The default value is labeladd=FALSE, i.e. no label is added.

label

a character vector of length n (the number of rows in the data matrix) containing the labels of the units. If this field is empty the sequence 1:n will be used to label the units.
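A short sketch of how label and labeladd might be used together, based on the built-in mtcars data. The split of the cars into two groups by horsepower is purely hypothetical; it only serves to put the units of interest in the last group. The plotting call is guarded so that it only runs in an interactive session with fsdaR installed.

```r
X <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
n <- nrow(X)

# Use the row names as labels, falling back to 1:n if there are none
label <- rownames(X)
if (is.null(label)) label <- as.character(seq_len(n))

# Hypothetical grouping: cars with hp > 200 get the larger id (last group)
group <- as.integer(mtcars$hp > 200)

if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  fsdaR::spmplot(X, group, label = label, labeladd = TRUE)
}
```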

legend

logical, controls whether a legend is shown or not.

dispopt

controls how to fill the main diagonal of the scatterplot matrix. Set dispopt='hist' (default) to plot histograms, or dispopt='box' to plot boxplots. The style used for univariate boxplots is 'traditional' if the number of groups is less than or equal to 5, otherwise it is 'compact' (boxes are drawn in a smaller style designed for plots with many groups).

tag

Plot handle. String which identifies the handle of the plot which is about to be created. The default is tag='pl_mmd'. Notice that if the program finds a plot which has a tag equal to the one specified by the user, then the output of the new plot overwrites the existing one in the same window; otherwise a new window is created.

datatooltip

If datatooltip is not empty the user can use the mouse in order to have information about the unit selected, the step in which the unit enters the search and the associated label. If datatooltip is a list, it is possible to control the aspect of the data cursor (see MATLAB function datacursormode() for more details or see the examples below). The default options are DisplayStyle="Window" and SnapToDataVertex="on".
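A minimal sketch of passing datatooltip as a list. The two option names and values are the defaults quoted above; the simulated data and the name `tooltip_opts` are illustrative, and it is assumed that fsmult() with monitoring=TRUE returns an object that spmplot() accepts as X, as described for that argument. The calls are guarded so that they only run in an interactive session with fsdaR installed.

```r
# Data cursor options, written out explicitly (these are the stated defaults)
tooltip_opts <- list(DisplayStyle = "Window", SnapToDataVertex = "on")

# Hypothetical data: 200 observations on 3 variables
set.seed(2)
X <- matrix(rnorm(200 * 3), ncol = 3)

if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  out <- fsdaR::fsmult(X, monitoring = TRUE)
  fsdaR::spmplot(out, datatooltip = tooltip_opts)
}
```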

databrush

Interactive mouse brushing. If databrush is missing or empty (default), no brushing is done. The activation of this option (databrush is TRUE or a list) enables the user to select a set of trajectories in the current plot and to see them highlighted in the scatterplot matrix. If the scatterplot matrix does not exist it is automatically created. In addition, brushed units can be highlighted in the monitoring MD plot. Note that the window style of the other figures is set equal to that which contains the monitoring residual plot. In other words, if the monitoring residual plot is docked all the other figures will be docked too.

If databrush=TRUE the default selection tool is a rectangular brush and it is possible to brush only once (that is persist="").

If databrush=list(...), it is possible to use all optional arguments of the MATLAB function selectdataFS() and the following optional arguments:

  • persist: This option can be an empty value or a character containing 'on' or 'off'. The default value is persist="", that is brushing is allowed only once. If persist="on" or persist="off", brushing can be done as many times as the user requires. If persist='on', the unit(s) currently brushed are added to those previously brushed. It is possible, every time a new brushing is done, to use a different color for the brushed units. If persist='off', every time a new brushing is performed the units previously brushed are removed.

  • labeladd: add labels of brushed units in the scatterplot matrix. If this option is '1', we label the units of the last selected group with the unit row index in the matrix X. The default value is labeladd="", i.e. no label is added.
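The two options above could be combined in a databrush list as sketched below. The simulated data and the name `brush_opts` are illustrative, and it is assumed that the monitoring object returned by fsmult() with monitoring=TRUE is passed as X. The calls are guarded so that they only run in an interactive session with fsdaR installed.

```r
# Repeated brushing that accumulates the brushed units, with labels added
brush_opts <- list(persist = "on", labeladd = "1")

# Hypothetical contaminated data: the first 5 units are shifted
set.seed(3)
n <- 200; p <- 3
Xcont <- matrix(rnorm(n * p), ncol = p)
Xcont[1:5, ] <- Xcont[1:5, ] + 3

if (interactive() && requireNamespace("fsdaR", quietly = TRUE)) {
  out <- fsdaR::fsmult(Xcont, monitoring = TRUE)
  fsdaR::spmplot(out, databrush = brush_opts)
}
```

With persist="off" instead, each new brushing would replace the previous selection rather than add to it.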

trace

Whether to print intermediate results. Default is trace=FALSE.

...

potential further arguments passed to lower level functions.

Value

none

Author(s)

FSDA team, valentin.todorov@chello.at

Examples


 ## Not run: 
 ##  Call of spmplot() without optional parameters.
 ##  Iris data: scatter plot matrix with univariate boxplots on the main
 ##  diagonal.

 X <- iris[,1:4]
 group <- iris[,5]
 spmplot(X, group, variables=c('SL','SW','PL','PW'), dispopt="box")


 ##  Example of spmplot() called by routine fsmult().
 ##  Generate contaminated data.
     n <- 200; p <- 3
     X <- matrix(rnorm(n*p), ncol=p)
     Xcont <- X
     Xcont[1:5,] <- Xcont[1:5,] + 3

 ##  spmplot is called automatically by all outlier detection methods, e.g. fsmult()
     out <- fsmult(Xcont, plot=TRUE)

 ##  Now test the direct use of spmplot(). Set two groups, e.g. those obtained
 ##  from fsmult().

     group <- rep(0, n)
     group[out$outliers] <- 1
 ##  option 'labeladd' is used to label the outliers
 ##  By default, the legend identifies the groups with the identifiers
 ##  given in vector 'group'.
 ##  Set the colors for the two groups to blue and red.

     spmplot(Xcont, group, col=c("blue", "red"), labeladd=1, dispopt="box")
 
## End(Not run)


fsdaR documentation built on March 31, 2023, 8:18 p.m.