Function to Plot Principal Component Analysis Loadings and Scores

Share:

Description

Function to display the results of a Principal Components Analysis (PCA) from the saved object from gx.mva, gx.mva.closed, gx.robmva, gx.robmva.closed or gx.rotate as biplots. Various options for displaying loadings and scores are available, see Details below.

Usage

1
2
3
4
gx.rqpca.plot(save, v1 = 1, v2 = 2, rplot = TRUE, qplot = TRUE,
	rowids = NULL, ifrot = TRUE, main = "", cex.lab = 0.9,
	cex.main = 0.9, rcex = 1, qcex = 0.8, rcol = 1, qcol = 1,
	ifray = TRUE, if34 = TRUE, ...)

Arguments

save

a saved object from the execution of function gx.mva, gx.mva.closed, gx.robmva,
gx.robmva.closed or gx.rotate.

v1

the component to be plotted on the x-axis of the biplot, default is the first component, v1 = 1.

v2

the component to be plotted on the y-axis of the biplot, default is the second component, v1 = 2.

rplot

the default is to plot the variables. If the variables are not required set rplot = FALSE. Note, if an ilr transform has been undertaken the loadings of the (p-1) synthetic variables will be displayed.

qplot

the default is to plot the observation (individual, case or sample) scores. If scores are not required set qplot = FALSE.

rowids

‘switch’ to determine if the input matrix row numbers are to be displayed instead of default plotting symbols. The default is for default plotting symbols, i.e. rowids = NULL, set rowids = TRUE if the row numbers are to be displayed, or set rowids = FALSE for the sample IDs to be displayed if they are present in the input matrix, if they are not, row numbers will be displayed.

ifrot

by default the post-Varimax rotation scores are displayed if a rotation has been made, see gx.rotate. If rotated scores are available in the saved object but the unrotated biplot is to be displayed set ifrot = FALSE.

main

an alternate plot title from that generated automatically from information in the saved object, see Details below.

cex.lab

the text scale expansion factor for the axis labels of the display, by default cex.axis = 0.9, a 10% font size reduction.

cex.main

the text scale expansion factor for the display title, by default cex.axis = 0.9, a 10% font size reduction.

rcex

the text scale expansion factor for the variable names in the display, by default cex = 1, 100%, no font size reduction.

qcex

the text scale expansion factor for the observation symbols, row numbers or sample IDs in the display, by default cex = 0.8, a 20% font size reduction.

rcol

the colour of the variable names in the display, by default rcol = 1, black, setting rcol = 2 will result in red variable names (see display.lty for the default colour palette).

qcol

the colour of the observation symbols, row numbers or sample IDs in the display, by default qcol = 1, black, setting qcol = 4 will result in blue variable names (see display.lty for the default colour palette).

ifray

by default ‘rays’ are plotted from the R loadings origin to the variable name in the same colour as the variable names, to omit the ‘rays’ set ifray = FALSE.

if34

this applies to a robust biplot only, by default the top (3) and right side (4) of the plot frame are annotated in black with the R loadings scale and with axis titles in the same colour as the variable names. To suppress these annotations set if34 = FALSE. When this extra annotation is provided the top (3) scale and annotation overwrites the plot title, therefore, set main = " " to suppress adding the plot title.

...

further arguments to be passed to methods concerning the plot.

Details

If main is undefined the name of the matrix object supplied to the function is displayed in the plot title. On the line below the name of the data matrix from which the PCA was derived is displayed. However, if an alternate plot title is preferred it may be defined, e.g., main = "Plot Title Text". If no plot title is required set main = " ".

If the variable names are longer than three characters the display can easily become cluttered. In which case the user should redefine the variable names in the input matrix from which the PCA was derived using the dimnames(matrix.name)[[2]] construct, and run the generating function again. Alternately, the variable names in the saved object may be changed directly via a redefinition of save$matnames[[2]].

Information on the percentage of the variability explained by each component, and whether or not rotation has been undertaken, is recovered from the saved object and used to appropriately label the plot axes. Note that for non-robust models the percentage variability explained will be the same as the percentage variability explained by the corresponding eigenvalues.

The following describes the available plot option combinations, the first being the default:
rplot = TRUE & qplot = TRUE & rowids = NULL, crosses (pch default) and variable names
rplot = TRUE & qplot = FALSE & rowids = NULL, variable names only
rplot = FALSE & qplot = TRUE & rowids = NULL, crosses (pch default) only
rplot = FALSE & qplot = TRUE & rowids = TRUE, input matrix row numbers only
rplot = FALSE & qplot = TRUE & rowids = FALSE, input matrix row identifiers
rplot = TRUE & qplot = TRUE & rowids = TRUE, input matrix row numbers and variable names

Because functions gx.mva, gx.mva.closed, gx.robmva or gx.robmva.closed require a matrix as input the sample IDs that may be in a data frame may be lost. They may be re-inserted by copying dimnames(...)[[1]] from a data frame into the matrix. Alternately, to plot in the component score space with Sample IDs, the scores can be recovered from the saved object, e.g., save$rqscore[, 1] and save$rqscore[, 2], and used as the x- and y-coordinates in function xyplot.tags with the sample IDs from the source data frame. Appropriate plot and axis titling can be displayed by setting the function arguments ‘by hand’.

Note

The scaling for the PCA ensures that the loadings and scores may be plotted in the same numerical space to create a biplot. However, when robust covariance matrix and mean estimation is employed this is not possible due to the extreme values some scores may attain for outliers. In this case the variables are plotted in the same physical space but with different scaling and addition scaling provided for the top and right side of the plot frame.

Author(s)

Robert G. Garrett

References

Reimann, C., Filzmoser, P., Garrett, R. and Dutter, R., 2008. Statistical Data Analysis Explained: Applied Environmental Statistics with R. John Wiley & Sons, Ltd., 362 p.

Venables, W.N. and Ripley, B.D., 2001. Modern Applied Statistics with S-Plus, 3rd Edition, Springer, 501 p.

See Also

gx.mva, gx.mva.closed, gx.robmva, gx.robmva.closed, gx.rotate, xyplot.tags

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
## Make test data available
data(sind)
data(sind.mat2open)
attach(sind)

## Save PCA results and display biplots
sind.save <- gx.mva.closed(sind.mat2open)
gx.rqpca.plot(sind.save)
gx.rqpca.plot(sind.save,
main = "Howarth & Sinding Larsen Stream Sediments\nclr transform",
pch = 4, rcol =2, qcol = 4)
gx.rqpca.plot(sind.save, rplot = TRUE, qplot = FALSE, rowids = NULL)
gx.rqpca.plot(sind.save, rplot = FALSE, qplot = TRUE, rowids = NULL)
gx.rqpca.plot(sind.save, rplot = FALSE, qplot = TRUE, rowids = TRUE)
gx.rqpca.plot(sind.save, rplot = TRUE, qplot = TRUE, rowids = FALSE,
rcol = 2, qcol = 4)

##
## Alternately
xyplot.tags(sind.save$rqscore[, 1],sind.save$rqscore[, 2], ID, cex = 0.9)

## Clean-up and detach test data
rm(sind.save)
detach(sind)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.