andrews: Andrew's plots

View source: R/andrews.R

Description

Andrew's plot is an exploratory technique for identifying clusters of similar observations.

Usage

andrews(x, ...)

## S3 method for class 'matrix'
andrews(x, draw = TRUE, res = 20, w = seq(-pi, pi, length = res),
  main = "Andrew's plot", xlab = "t", ylab = "f(t)", pch = 1,
  lty = 1, col = 1, ...)

## S3 method for class 'data.frame'
andrews(x, ...)

Arguments

x

numeric matrix, vector or data frame that contains only numeric variables

...

further arguments passed on to other methods and ultimately to matplot

draw

logical, whether the plot should be produced

res

numeric, number of points at which the transformed variables are evaluated, see Details

w

numeric vector, sequence of points at which the transformed variables are evaluated

main, xlab, ylab, pch, lty, col

arguments passed to matplot, see par

Details

Andrew's plot shows each observation in a multivariate data set as a curve over the interval [-pi, pi]. Formally, each observation x = (x_1, x_2, ..., x_p) is transformed according to the following formula (from Everitt (1993)):

f(t) = 1/sqrt(2) x_1 + x_2 sin(t) + x_3 cos(t) + x_4 sin(2t) + x_5 cos(2t) + ...

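and plotted against the above-mentioned interval. The transformation preserves Euclidean distances up to a constant factor: the integral over [-pi, pi] of the squared difference between two curves equals pi times the squared Euclidean distance between the two observations. Similar observations therefore produce similar curves, and if two curves are identical, so are the observations.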
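The transformation can be sketched in base R as follows. This is only an illustration of the formula above, not the package's own code: the helper name andrews_curve is hypothetical, and the orientation of its result (rows for evaluation points, columns for observations) is an assumption made for this sketch. The last lines check the distance property numerically for two made-up observations.

```r
# Hypothetical helper, NOT the package's andrews(): evaluates the curve of
# each row of a numeric matrix `x` at the points in `w`.
andrews_curve <- function(x, w = seq(-pi, pi, length.out = 20)) {
  x <- as.matrix(x)
  p <- ncol(x)
  # One column per variable: 1/sqrt(2), sin(t), cos(t), sin(2t), cos(2t), ...
  basis <- matrix(0, nrow = length(w), ncol = p)
  for (j in seq_len(p)) {
    k <- j %/% 2  # harmonic number: 1, 1, 2, 2, 3, 3, ... for j = 2, 3, ...
    basis[, j] <- if (j == 1) {
      1 / sqrt(2)
    } else if (j %% 2 == 0) {
      sin(k * w)
    } else {
      cos(k * w)
    }
  }
  # Rows: evaluation points; columns: observations (assumed layout)
  basis %*% t(x)
}

# Two made-up observations with three variables each
x <- rbind(c(1, 2, 3), c(2, 0, 1))
w <- seq(-pi, pi, length.out = 2001)
f <- andrews_curve(x, w)

# Integrate the squared difference of the two curves over one period
# (rectangle rule, dropping the duplicated end point)
h <- w[2] - w[1]
d2 <- (f[, 1] - f[, 2])^2
integral <- sum(d2[-length(d2)]) * h

# Compare with pi times the squared Euclidean distance
all.equal(integral, pi * sum((x[1, ] - x[2, ])^2))
```

The comparison uses all.equal rather than == because the integral is computed numerically.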

By default the functions are evaluated on an equally spaced grid of res points from -pi to pi. Custom grids can be supplied via the w argument.
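As a rough illustration of the two arguments, the grids below are built with base R only. The call at the end assumes that the package is installed and that andrews is exported from mbtools; it is skipped otherwise.

```r
res <- 20
# Grid equivalent to the documented default, w = seq(-pi, pi, length = res)
w_default <- seq(-pi, pi, length.out = res)

# A finer custom grid for smoother curves
w_fine <- seq(-pi, pi, length.out = 200)

# Assumption: mbtools is installed and exports andrews()
if (requireNamespace("mbtools", quietly = TRUE)) {
  mbtools::andrews(iris[1:4], w = w_fine, main = "Iris, 200 evaluation points")
}
```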

Other arguments are passed to matplot.

Value

Depending on the value of the draw argument the plot is produced (default) or not. In both cases the function returns a matrix of transformed observations invisibly.
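Because the result is returned invisibly, it has to be assigned to be inspected. A minimal sketch, again assuming that mbtools is installed and exports andrews; the layout of the returned matrix is not stated here, so the sketch only prints its dimensions.

```r
m <- NULL
# Assumption: mbtools is installed and exports andrews()
if (requireNamespace("mbtools", quietly = TRUE)) {
  # draw = FALSE suppresses the plot; the transformed values are still returned
  m <- mbtools::andrews(iris[1:4], draw = FALSE)
  print(dim(m))
}
```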

References

Everitt, B. S. (1993) "Cluster Analysis", New York: John Wiley and Sons

See Also

Package cluster

Examples

### Using artificial data
d <- data.frame( x = c( rnorm(50, 1), rnorm(50, 4), rnorm(50, 7)),
    y = c( rnorm(50,1), rnorm(50, 7), rnorm(50, 1)),
    k = rep(1:3, each=50) )
# plotting
# show data
plot(d$x, d$y, pch=d$k)
# Andrew's plots
layout( matrix(1:4, ncol=2))
andrews( d[1:2], main="Unsupervised x,y")
andrews( d[2:1], main="Unsupervised y,x")
andrews( d[1:2], col=d$k, main="Color-coded x,y")
andrews( d[2:1], col=d$k, main="Color-coded y,x")
# three curves for cluster means
andrews( cbind( tapply(d$x, d$k, mean), tapply(d$y, d$k, mean)),
    col=1:3 )

### Using 'iris' data
d <- iris[1:4]
# Andrew's plots
layout( matrix(1:4, ncol=2) )
# "unsupervised"
andrews(d, lty=1, col=1, main="Andrew's plot of Iris data")
# colored with species
andrews(d, lty=1, col=match( iris$Species, unique(iris$Species)),
    main="Andrew's plot of Iris data\n color-coded species")
# Andrew's plot on standardized data
andrews( scale(d), main="Andrew's plot of standardized Iris data")
# Andrew's plot on principal components
pcad <- princomp(d)
andrews( pcad$scores, main="Andrew's plot of PCA of Iris data")

mbojan/mbtools documentation built on Nov. 9, 2017, 3:21 p.m.