The function carries out a Principal Components Analysis (PCA), estimates the Mahalanobis distances for a dataset, and places the results in an object to be saved and post-processed for display and further manipulation. Classical procedures are used; for robust procedures see `gx.robmva`. For results display see `gx.rqpca.screeplot`, `gx.rqpca.loadplot`, `gx.rqpca.plot`, `gx.rqpca.print`, `gx.md.plot` and `gx.md.print`. For Kaiser varimax rotation see `gx.rotate`. For closed, compositional, data use `gx.mva.closed`.

```
gx.mva(xx, main = deparse(substitute(xx)))
```

`xx` |
a |

`main` |
by default the name of the object |

If `main` is undefined the name of the matrix object passed to the function is used to identify the object. This is the recommended procedure as it helps to track the progression of a data analysis. Alternate plot titles are best defined when the saved object is passed to `gx.rqpca.plot`, `gx.rqpca.screeplot` or `gx.md.plot` for display. If no plot title is required set `main = " "`, or if a user defined plot title is required it may be defined, e.g., `main = "Plot Title Text"`.

The following are returned as an object to be saved for subsequent display, etc.:

`main` |
by default (recommended) the input data matrix name. |

`input` |
the data matrix name, |

`proc` |
the procedure used, by default |

`n` |
the total number of individuals (observations, cases or samples) in the input data matrix. |

`nc` |
the number of individuals remaining in the ‘core’ data subset after trimming. At this stage of a data analysis |

`p` |
the number of variables on which the multivariate operations were based. |

`ifilr` |
flag for |

`matnames` |
the row numbers or identifiers and column headings of the input matrix. |

`wts` |
the vector of weights for the |

`mean` |
the vector the weighted means for the |

`cov` |
the |

`sd` |
the vector of weighted standard deviations for the |

`snd` |
the |

`r` |
the |

`eigenvalues` |
the vector of |

`econtrib` |
the vector of |

`eigenvectors` |
the |

`rload` |
the |

`rcr` |
the |

`rqscore` |
the |

`vcontrib` |
a vector of |

`pvcontrib` |
the vector of |

`cpvcontrib` |
the vector of |

`md` |
the vector of |

`ppm` |
the vector of |

`epm` |
the vector of |

`nr` |
the number of PCs that have been rotated. At this stage of a data analysis |
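Several of the quantities listed above have close analogues in base R. The following is a minimal sketch only, assuming a correlation-based PCA and the ordinary sample covariance matrix; it is not the code used by `gx.mva`, and the `iris` measurements and all object names are stand-ins chosen for illustration.

```r
## Stand-in data: any complete numeric matrix would do
x <- as.matrix(iris[, 1:4])
n <- nrow(x)
p <- ncol(x)

## Classical location and scatter estimates
centre <- colMeans(x)
S <- cov(x)
R <- cov2cor(S)

## Eigen-decomposition of the correlation matrix (an assumption of this sketch)
e <- eigen(R, symmetric = TRUE)
pct <- 100 * e$values / sum(e$values)   # percentage of variance per component

## Component scores from the standardised data
scores <- scale(x) %*% e$vectors

## Squared Mahalanobis distances from the classical estimates
md <- mahalanobis(x, centre, S)
```

In this sketch the eigenvalues sum to `p` because the decomposition is of a correlation matrix, and the variance of each column of `scores` equals the corresponding eigenvalue.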

Any less than detection limit values represented by negative values, or zeros or other numeric codes representing blanks in the data, must be removed prior to executing this function; see `ltdl.fix.df`.
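A quick screen for such values can be run before calling the function. The sketch below uses base R only; the small matrix and the blank codes `0` and `9999` are hypothetical, so substitute whatever codes the data set actually uses. It only counts offending values and makes no attempt to replace them.

```r
## Hypothetical data: -5 marks a less than detection limit value,
## 0 and 9999 are assumed blank codes
x <- matrix(c(12, -5, 30, 0, 8, 9999, 14, 22, 17), ncol = 3,
            dimnames = list(NULL, c("Cu", "Zn", "Pb")))
blank_codes <- c(0, 9999)

## Count negative values per column
n_negative <- colSums(x < 0)

## Count blank codes per column (%in% returns a vector, so reshape it)
n_blank <- colSums(matrix(x %in% blank_codes, nrow = nrow(x)))

n_negative
n_blank
```

Any column with a non-zero count needs attention before the matrix is analysed.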

Any rows in the data matrix with `NA`s are removed prior to computations. In the instance of a compositional data opening transformation, `NA`s have to be removed prior to undertaking the transformation; see `na.omit`, `where.na` and `remove.na`. When that procedure is followed the opening transformations may be executed on calling the function, see Examples below.
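As an illustration of removing incomplete rows beforehand, here is a minimal base R sketch using `na.omit` on a small hypothetical matrix; the data values are invented for the example.

```r
## Hypothetical 5 x 2 matrix with NAs in rows 2 and 4
x <- matrix(c(1.2, NA, 3.1, 4.0, 2.2,
              5.3, 0.9, 1.1, NA, 2.5), ncol = 2)

## na.omit drops every row that contains at least one NA
x_complete <- na.omit(x)

nrow(x)
nrow(x_complete)
```

The result carries an `na.action` attribute recording which rows were dropped, which can help when matching results back to the original samples.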

Note that executing a `clr` transformation leads to a singular matrix that cannot be inverted for the estimation of Mahalanobis distances. In that case the values of `md`, `ppm` and `epm` are all set to `NULL`.
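The singularity can be seen directly in a small simulation. The sketch below is base R only and uses a hypothetical, randomly generated 4-part composition; the centred log-ratio is written out by hand rather than taken from any package.

```r
set.seed(1)

## Hypothetical 4-part composition, closed so that each row sums to 1
raw <- matrix(rexp(50 * 4), ncol = 4)
comp <- raw / rowSums(raw)

## Centred log-ratio: log of each part minus the row mean of the logs
clr <- log(comp) - rowMeans(log(comp))

## Every clr row sums to zero, so the covariance matrix loses one rank
S <- cov(clr)
max(abs(rowSums(clr)))
qr(S)$rank
```

Because the rank is `p - 1` rather than `p`, the covariance matrix has no ordinary inverse, which is why the Mahalanobis distances cannot be estimated.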

Note that executing an `ilr` transformation permits the estimation of Mahalanobis distances and associated probabilities through the use of `p-1` synthetic variables. However, in that instance the loadings of the `p-1` synthetic variables will be plotted by `gx.rqpca.plot` rather than the loadings for the elements.
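To show why `p-1` synthetic variables restore invertibility, the following base R sketch builds one possible isometric log-ratio from normalised Helmert contrasts. The choice of basis, the simulated data, and the chi-square tail probability are all assumptions of this sketch and may differ from what the package computes.

```r
set.seed(1)

## Hypothetical 4-part composition, closed so that each row sums to 1
raw <- matrix(rexp(50 * 4), ncol = 4)
comp <- raw / rowSums(raw)
p <- ncol(comp)

## Centred log-ratio, then projection onto an orthonormal basis
clr <- log(comp) - rowMeans(log(comp))
V <- contr.helmert(p)                       # p x (p - 1), columns sum to zero
V <- sweep(V, 2, sqrt(colSums(V^2)), "/")   # normalise columns to unit length
ilr <- clr %*% V                            # p - 1 synthetic variables

## The covariance of the synthetic variables is full rank, so it inverts
md <- mahalanobis(ilr, colMeans(ilr), cov(ilr))

## Illustrative tail probabilities, assuming a chi-square reference
## distribution with p - 1 degrees of freedom
prob <- 1 - pchisq(md, df = p - 1)
```

The columns of `ilr` are mixtures of all the original parts, which is why loadings computed from them refer to the synthetic variables rather than to individual elements.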

Therefore, use function `gx.mva.closed` for compositional, geochemical, data.

Warnings are generated when the number of individuals (observations, cases or samples) falls below `5p`, and additional warnings when the number of individuals falls below `3p`. At these low ratios of individuals to variables the shape of the `p`-space hyperellipsoid is difficult to define reliably, and therefore the results may lack stability. The limits `5p` and `3p` are generous, the latter especially so; many statisticians would argue that the number of individuals should not fall below `9p`, see Garrett (1993).
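The ratio can be checked before running an analysis. The helper below, `check_np_ratio`, is a hypothetical name introduced for this sketch and is not part of the package; it simply compares the number of rows with the `3p`, `5p` and `9p` limits discussed above.

```r
## Hypothetical helper: classify a data matrix by its ratio of
## individuals (rows) to variables (columns)
check_np_ratio <- function(x) {
  n <- nrow(x)
  p <- ncol(x)
  if (n < 3 * p) {
    "below 3p"
  } else if (n < 5 * p) {
    "below 5p"
  } else if (n < 9 * p) {
    "below 9p"
  } else {
    "at or above 9p"
  }
}

## Example with 4 variables and differing numbers of individuals
check_np_ratio(matrix(0, nrow = 10, ncol = 4))
check_np_ratio(matrix(0, nrow = 18, ncol = 4))
check_np_ratio(matrix(0, nrow = 40, ncol = 4))
```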

Robert G. Garrett

Garrett, R.G., 1990. A robust multivariate allocation procedure with applications to geochemical data. In Proc. Colloquium on Statistical Applications in the Earth Sciences (Eds F.P. Agterberg & G.F. Bonham-Carter). Geological Survey of Canada Paper 89-9, pp. 309-318.

Garrett, R.G., 1993. Another cry from the heart. Explore - Assoc. Exploration Geochemists Newsletter, 81:9-14.

Grunsky, E.C., 2001. A program for computing RQ-mode principal components analysis for S-Plus and R. Computers & Geosciences, 27(2):229-235.

Reimann, C., Filzmoser, P., Garrett, R. and Dutter, R., 2008. Statistical Data Analysis Explained: Applied Environmental Statistics with R. John Wiley & Sons, Ltd., 362 p.

`ltdl.fix.df`, `remove.na`, `na.omit`, `gx.rqpca.screeplot`, `gx.rqpca.loadplot`, `gx.rqpca.plot`, `gx.rqpca.print`, `gx.md.plot`, `gx.md.print`, `gx.robmva`, `gx.robmva.closed`, `gx.rotate`

```
## Make test data available
data(sind.mat2open)
## Generate gx.mva object, for demonstration purposes only
## These are compositional data - gx.mva.closed should be used
sind.save <- gx.mva(sind.mat2open)
gx.rqpca.screeplot(sind.save)
gx.rqpca.loadplot(sind.save)
gx.rqpca.plot(sind.save)
## Display saved object with alternate main titles
gx.rqpca.loadplot(sind.save,
    main = "Howarth & Sinding-Larsen\nStream Sediments, clr Transformed Data",
    cex.main = 0.8)
gx.rqpca.plot(sind.save,
    main = "Howarth & Sinding-Larsen\nStream Sediments, clr Transformed Data",
    cex.main = 0.8)
## Display Mahalanobis distances in a Chi-square plot
gx.md.plot(sind.save)
## Display saved object with alternate main titles
gx.md.plot(sind.save,
    main = "Howarth & Sinding-Larsen\nStream Sediments, ilr Transformed Data",
    cex.main = 0.8)
## Clean-up
rm(sind.save)
```
