scatter.hist: Draw a scatter plot with associated X and Y histograms,...

scatterHist    R Documentation

Draw a scatter plot with associated X and Y histograms, densities and correlation

Description

Draw an X-Y scatter plot with associated X and Y histograms and estimated densities. It can also draw density plots and distribution ellipses by group. Partly a demonstration of the use of layout. Also includes a lowess smooth or linear model slope, as well as the correlation and Mahalanobis distances.

Usage

scatterHist(x,y=NULL,smooth=TRUE,ab=FALSE, correl=TRUE,data=NULL, density=TRUE,means=TRUE, 
   ellipse=TRUE,digits=2,method="pearson",cex.cor=1,cex.point=1,
   title="Scatter plot + density",
   xlab=NULL,ylab=NULL,smoother=FALSE,nrpoints=0,xlab.hist=NULL,ylab.hist=NULL,grid=FALSE,
   xlim=NULL,ylim=NULL,x.breaks=11,y.breaks=11,
   x.space=0,y.space=0,freq=TRUE,x.axes=TRUE,y.axes=TRUE,size=c(1,2),
   col=c("blue","red","black"),legend=NULL,alpha=.5,pch=21, show.d=TRUE,
   x.arrow=NULL,y.arrow=NULL,d.arrow=FALSE,cex.arrow=1,...)
 
scatter.hist(x,y=NULL,smooth=TRUE,ab=FALSE, correl=TRUE,data=NULL,density=TRUE,
  means=TRUE, ellipse=TRUE,digits=2,method="pearson",cex.cor=1,cex.point=1,
  title="Scatter plot + density",
  xlab=NULL,ylab=NULL,smoother=FALSE,nrpoints=0,xlab.hist=NULL,ylab.hist=NULL,grid=FALSE,
  xlim=NULL,ylim=NULL,x.breaks=11,y.breaks=11,
  x.space=0,y.space=0,freq=TRUE,x.axes=TRUE,y.axes=TRUE,size=c(1,2),
  col=c("blue","red","black"),legend=NULL,alpha=.5,pch=21, show.d=TRUE,
   x.arrow=NULL,y.arrow=NULL,d.arrow=FALSE,cex.arrow=1,...)

Arguments

x

The X vector, or the first column of a data.frame or matrix. Can be specified using formula input.

y

The Y vector or, if x is a data.frame or matrix, the second column of x

smooth

if TRUE, then add a loess smooth to the plot

ab

if TRUE, then show the best-fitting straight line

correl

TRUE: Show the correlation

data

if using formula input, the data must be specified

density

TRUE: Show the estimated densities

means

If TRUE, show the means of the distributions.

ellipse

TRUE: draw 1 and 2 sigma ellipses and smooth

digits

How many digits to use if showing the correlation

method

Which method to use for correlation ("pearson","spearman","kendall") defaults to "pearson"

smoother

if TRUE, use smoothScatter instead of plot. Nice for large N.

nrpoints

If using smoothScatter, show nrpoints as dots. Defaults to 0

grid

If TRUE, show a grid for the scatter plot.

cex.cor

Adjustment for the size of the correlation

cex.point

Adjustment for the size of the data points

xlab

Label for the x axis

ylab

Label for the y axis

xlim

Limits for the x axis. This appears to affect only the scatter plot, not the histograms.

ylim

Limits for the y axis

x.breaks

Number of breaks to suggest to the x axis histogram.

y.breaks

Number of breaks to suggest to the y axis histogram.

x.space

Space between x bars

y.space

Space between y bars

freq

If TRUE, show frequency counts; otherwise show densities

x.axes

Show the x axis for the x histogram

y.axes

Show the y axis for the y histogram

size

The sizes of the ellipses (in sd units). Defaults to 1,2

col

Colors to use when showing groups

alpha

Amount of transparency in the density plots

legend

Where to put a legend c("topleft","topright","top","left","right")

pch

Base plot character (each group is one more)

xlab.hist

Label for x axis histogram. Not currently available.

ylab.hist

Label for y axis histogram. Not currently available.

title

An optional title

show.d

If TRUE, show the distances between the groups

d.arrow

If TRUE, draw an arrow between the two centroids

x.arrow

Optional label for the arrow connecting the two groups for the x axis

y.arrow

Optional label for the arrow connecting the two groups for the y axis

cex.arrow

Size adjustment (cex) for the arrow labels.

...

Other parameters for graphics
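
As a rough illustration of what the method and digits arguments control, the sketch below computes the three kinds of correlation with base R's cor() and rounds them for display. It uses the built-in iris data rather than scatterHist itself, and it is an assumption of this sketch that scatterHist obtains and rounds its coefficient in a comparable way.

```r
# Illustration only: base R cor(), not a call into psych.
x <- iris$Sepal.Length
y <- iris$Petal.Length

# The three values accepted by the method argument
methods <- c("pearson", "spearman", "kendall")
r <- sapply(methods, function(m) {
  cor(x, y, method = m, use = "pairwise.complete.obs")
})

# Round for display, as a digits = 2 setting would
r_shown <- round(r, 2)
print(r_shown)
```

Rank-based methods ("spearman", "kendall") are the usual choice when the relationship is monotone but not linear, or when outliers are a concern.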

Details

Just a straightforward application of layout and barplot, with some tricks taken from pairs.panels. The various options allow for correlation ellipses (1 and 2 sigma from the mean), lowess smooths, linear fits, density curves on the histograms, and the value of the correlation. ellipse = TRUE implies smooth = TRUE. The grid option provides a background grid to the scatterplot.
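
The layout and barplot approach described above can be sketched in a few lines of base R. This is a stripped-down illustration, not the scatterHist source: margin_sketch is a hypothetical name, the panel proportions are arbitrary, and none of the options (densities, ellipses, smooths, groups) are handled.

```r
# Illustration only: a scatter plot with marginal histograms built from
# layout() and barplot(). margin_sketch is a hypothetical helper.
margin_sketch <- function(x, y, breaks = 11) {
  xh <- hist(x, breaks = breaks, plot = FALSE)  # bin counts for the top margin
  yh <- hist(y, breaks = breaks, plot = FALSE)  # bin counts for the right margin

  old_mar <- par("mar")
  on.exit({
    par(mar = old_mar)  # restore the margins
    layout(1)           # and return to a single panel
  })

  # 2 x 2 grid: panel 1 = scatter (bottom left), panel 2 = x histogram (top),
  # panel 3 = y histogram (right); the top-right cell (0) is left empty
  layout(matrix(c(2, 0, 1, 3), 2, 2, byrow = TRUE),
         widths = c(3, 1), heights = c(1, 3))

  par(mar = c(4, 4, 1, 1))
  plot(x, y, xlim = range(xh$breaks), ylim = range(yh$breaks))

  par(mar = c(0, 4, 1, 1))
  barplot(xh$counts, axes = FALSE, space = 0)

  par(mar = c(4, 0, 1, 1))
  barplot(yh$counts, axes = FALSE, space = 0, horiz = TRUE)

  invisible(list(x = xh, y = yh))
}

h <- margin_sketch(faithful$eruptions, faithful$waiting)
```

Setting the scatter plot limits to the range of the histogram breaks keeps the bars roughly aligned with the points beneath and beside them.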

If using grouping variables, will draw ellipses (defaults to 1 sd) around each centroid. This is useful when demonstrating Mahalanobis distances.
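
To make the idea of an ellipse around a centroid concrete, the following sketch traces the set of points lying exactly k standard deviations (in the Mahalanobis sense) from a group mean. sd_ellipse is a hypothetical helper written for this illustration; how scatterHist constructs its ellipses internally is not documented here and may differ.

```r
# Illustration only: points on a k-sd ellipse around a group centroid.
# sd_ellipse is a hypothetical helper, not a psych function.
sd_ellipse <- function(xy, k = 1, n = 100) {
  centre <- colMeans(xy)
  S <- cov(xy)
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))  # points on the unit circle
  # chol(S) is the upper triangular R with t(R) %*% R == S, so
  # circle %*% R stretches the circle to the shape of S
  pts <- k * circle %*% chol(S)
  sweep(pts, 2, centre, "+")               # shift to the centroid
}

setosa <- as.matrix(iris[iris$Species == "setosa",
                         c("Sepal.Length", "Sepal.Width")])
e1 <- sd_ellipse(setosa, k = 1)
e2 <- sd_ellipse(setosa, k = 2)

plot(setosa, pch = 21, xlim = range(e2[, 1]), ylim = range(e2[, 2]))
lines(e1, col = "blue")
lines(e2, col = "blue", lty = 2)
```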

Formula input allows specification of grouping variables as well.

When plotting data for two groups, the Mahalanobis distance between the groups may be shown by drawing an arrow between the two centroids. This is a bit messy, and it is useful to use pch="." in this case.
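
For readers who want the number behind the arrow, a Mahalanobis distance between two centroids can be computed directly with stats::mahalanobis(). The sketch below uses a pooled within-group covariance matrix, which is an assumption of this illustration and not necessarily the scaling scatterHist uses.

```r
# Illustration only: distance between two group centroids in pooled-sd units.
vars <- c("Sepal.Length", "Sepal.Width")
g1 <- as.matrix(iris[iris$Species == "setosa", vars])
g2 <- as.matrix(iris[iris$Species == "versicolor", vars])

# Pooled within-group covariance (an assumption of this sketch)
n1 <- nrow(g1)
n2 <- nrow(g2)
S_pooled <- ((n1 - 1) * cov(g1) + (n2 - 1) * cov(g2)) / (n1 + n2 - 2)

# mahalanobis() returns the squared distance, so take the square root
D <- sqrt(mahalanobis(colMeans(g1), colMeans(g2), S_pooled))
print(round(D, 2))
```

Unlike a difference on a single variable, this distance takes the correlation between the two variables into account.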

Note

Originally adapted from Addicted to R example 78. Modified following some nice suggestions from Jared Smith. Substantial revisions in 2021 to allow for a clearer demonstration of group differences.

Author(s)

William Revelle

See Also

pairs.panels for multiple plots, multi.hist for multiple histograms and histBy for single variables with multiple groups. Perhaps the best example is found in the psychTools::GERAS data set.

Examples

data(sat.act)
with(sat.act,scatterHist(SATV,SATQ))
scatterHist(SATV ~ SATQ,data=sat.act)  #formula input

#or for something a bit more splashy
scatter.hist(sat.act[5:6],pch=(19+sat.act$gender),col=c("blue","red")[sat.act$gender],grid=TRUE)
#better yet
scatterHist(SATV ~ SATQ + gender,data=sat.act) #formula input with a grouping variable

psych documentation built on June 27, 2024, 5:07 p.m.