# andrews: Andrew's plots In mbojan/mbtools: Chaotic Collection of Functions and Datasets Possibly Useful Also To Others

## Description

Andrew's plot is an exploratory technique for identifying clusters of similar observations.

## Usage

```
andrews(x, ...)

## S3 method for class 'matrix'
andrews(x, draw = TRUE, res = 20, w = seq(-pi, pi, length = res),
  main = "Andrew's plot", xlab = "t", ylab = "f(t)", pch = 1, lty = 1,
  col = 1, ...)

## S3 method for class 'data.frame'
andrews(x, ...)
```

## Arguments

| Argument | Description |
| --- | --- |
| `x` | numeric matrix, vector or data frame that contains only numeric variables |
| `...` | other arguments passed to other methods, `matplot` in the end |
| `draw` | logical, whether the plot should be produced |
| `res` | numeric, number of points on which transformed variables are evaluated, see Details |
| `w` | numeric vector, sequence of points on which transformed variables are evaluated |
| `main, xlab, ylab, pch, lty, col` | arguments passed to `matplot`, see `par` |

## Details

Andrew's plot shows each observation in a multivariate data set as a curve over the interval [-pi, pi]. Formally, each observation x = (x_1, x_2, ..., x_p) is transformed into a function of t according to the following formula (from Everitt (1993)):

$$f_x(t) = \frac{x_1}{\sqrt{2}} + x_2 \sin(t) + x_3 \cos(t) + x_4 \sin(2t) + x_5 \cos(2t) + \dots$$

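and plotted against the above-mentioned interval. The transformation preserves Euclidean distances up to a constant factor, so observations that are close to each other produce curves that are close to each other, and if two curves are identical, so are the observations.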
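The transformation can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: the helper name `andrews_sketch` is hypothetical, and the orientation of its result (rows for points of `w`, columns for observations) is an assumption made here for convenience. The second part checks numerically the well-known relation behind the distance property: the integrated squared difference of two curves over [-pi, pi] equals pi times the squared Euclidean distance between the observations.

```r
# Hypothetical helper, not part of mbtools: evaluates the curve of every
# row of a numeric matrix `x` at the points in `w`.
andrews_sketch <- function(x, w = seq(-pi, pi, length.out = 20)) {
  x <- as.matrix(x)
  p <- ncol(x)
  # Basis functions in column order: 1/sqrt(2), sin(t), cos(t), sin(2t), cos(2t), ...
  basis <- vapply(seq_len(p), function(j) {
    if (j == 1) return(rep(1 / sqrt(2), length(w)))
    k <- j %/% 2                           # frequency: 1, 1, 2, 2, 3, 3, ...
    if (j %% 2 == 0) sin(k * w) else cos(k * w)
  }, numeric(length(w)))
  basis <- matrix(basis, nrow = length(w)) # keep a matrix even if length(w) == 1
  # Assumed layout: one row per point of w, one column per observation
  basis %*% t(x)
}

# Numerical check of the distance relation for two arbitrary observations.
# The transformation is linear, so the difference of the two curves is
# the curve of a - b.
a <- c(1, 2, 3, 4)
b <- c(0, -1, 5, 2)
sq_diff <- function(t) as.vector(andrews_sketch(rbind(a - b), t))^2
curve_dist2 <- integrate(sq_diff, -pi, pi, rel.tol = 1e-10)$value
point_dist2 <- sum((a - b)^2)
# curve_dist2 and pi * point_dist2 should agree up to integration error
```

With curves computed this way, `matplot(w, andrews_sketch(x, w), type = "l")` would draw one line per observation, which matches the description of the plot given above.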

By default the functions are evaluated on a grid of `res` equally spaced points from -pi to pi. Custom grids can be supplied via the `w` argument.
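As an illustration of the evaluation grid, the following base R sketch builds the default grid and two custom ones. The variable names and the particular custom grids are hypothetical choices made for this example. The Usage section writes `length = res`, which `seq` resolves to its `length.out` argument by partial matching; the full name is used here.

```r
res <- 20                                     # default value from the Usage section
w_default <- seq(-pi, pi, length.out = res)   # grid implied by the default `w`

# Neighbouring grid points are 2 * pi / (res - 1) apart
step <- diff(w_default)[1]

# Hypothetical custom grids: a finer one for smoother-looking curves,
# and one restricted to the right half of the interval
w_fine <- seq(-pi, pi, length.out = 200)
w_half <- seq(0, pi, length.out = 50)
```

Following the signature in the Usage section, such a vector would be supplied as `andrews(x, w = w_fine)`.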

Other arguments are passed to `matplot`.

## Value

If `draw` is `TRUE` (the default), the plot is produced; otherwise no plot is drawn. In both cases the function returns a matrix of transformed observations invisibly.
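Because the result is returned invisibly, it has to be assigned to be inspected. The sketch below assumes that the mbtools package is installed and that `andrews` is accessible as `mbtools::andrews`; it does nothing otherwise. Which dimension of the returned matrix indexes observations is not stated in this page, so the example inspects the dimensions rather than assuming a layout.

```r
# Assumption: mbtools (GitHub: mbojan/mbtools) is installed and exports andrews()
curves <- NULL
if (requireNamespace("mbtools", quietly = TRUE)) {
  # draw = FALSE suppresses the plot; the transformed observations are
  # still returned, so assign the result to keep it
  curves <- mbtools::andrews(iris[1:4], draw = FALSE)
  # Inspect the layout instead of assuming which dimension holds observations
  print(dim(curves))
}
```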

## References

Everitt, B. S. (1993) "Cluster Analysis", New York: John Wiley and Sons

## Examples

```
### Using artificial data
d <- data.frame( x = c( rnorm(50, 1), rnorm(50, 4), rnorm(50, 7)),
    y = c( rnorm(50,1), rnorm(50, 7), rnorm(50, 1)),
    k = rep(1:3, each=50) )
# plotting
# show data
plot(d$x, d$y, pch=d$k)
# Andrew's plots
layout( matrix(1:4, ncol=2))
andrews( d[1:2], main="Unsupervised x,y")
andrews( d[2:1], main="Unsupervised y,x")
andrews( d[1:2], col=d$k, main="Color-coded x,y")
andrews( d[2:1], col=d$k, main="Color-coded y,x")
# three curves for cluster means
andrews( cbind( tapply(d$x, d$k, mean), tapply(d$y, d$k, mean)), col=1:3 )

### Using 'iris' data
d <- iris[1:4]
# Andrew's plots
layout( matrix(1:4, ncol=2) )
# "unsupervised"
andrews(d, lty=1, col=1, main="Andrew's plot of Iris data")
# colored with species
andrews(d, lty=1, col=match( iris$Species, unique(iris$Species)),
    main="Andrew's plot of Iris data\n color-coded species")
# Andrew's plot on standardized data
andrews( scale(d), main="Andrew's plot of standardized Iris data")
# Andrew's plot on principal components
pcad <- princomp(d)
andrews( pcad$scores,
    main="Andrew's plot of PCA of Iris data")
```