pcaOutId: Identify outliers in a PCA plot based on expansion of...

Description

Identifies potential analytical/preparative outliers on a PCA scores plot based on a proportional expansion of the Hotelling's T2 ellipse, using the pcaMethods package.

Usage

pcaOutId(peakTable = NULL, obsNames = NULL, outTol = 1.2, maxIter = 2,
  ...)

Arguments

peakTable

either a data.frame or the full file path, as a character string, to a .csv file of a peak table, arranged with observations (samples) in columns and variables (mass spectral signals) in rows. If the argument is not supplied, a GUI file selection window will open so that a .csv file can be selected.

obsNames

character vector of observation (i.e. sample/QC/blank) names used to identify the appropriate observation (sample) columns.

outTol

proportional expansion value for the Hotelling's T2 ellipse (PC1 and PC2). Any samples lying beyond the expanded ellipse will be removed. default = 1.2

maxIter

maximum number of iterations of PCA model calculation/outlier removal to perform. The iteration process stops early once no further outliers are detected in the most recently calculated PCA model. default = 2

...

additional arguments passed to pca.
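The peak table layout described for peakTable and obsNames can be illustrated with a small synthetic example. This is only a sketch: the column names (mzmed, rtmed, sample_1, ...) are hypothetical, and selecting observation columns by exact name match is an assumption about how obsNames is used, not a statement of the package's internals.

```r
# Synthetic peak table: variables (mass spectral signals) in rows,
# observations (samples) in columns. All names here are hypothetical.
set.seed(1)
nVar <- 50
obsNames <- paste0("sample_", 1:10)
intens <- matrix(rlnorm(nVar * length(obsNames), meanlog = 10),
                 nrow = nVar, dimnames = list(NULL, obsNames))
peakTable <- data.frame(mzmed = round(runif(nVar, 100, 1000), 4),
                        rtmed = round(runif(nVar, 30, 600), 1),
                        intens, check.names = FALSE)

# Assumption: observation columns are picked by exact name match.
obsCols <- colnames(peakTable) %in% obsNames

# PCA treats observations as rows, so the observation block is transposed.
obsMatrix <- t(as.matrix(peakTable[, obsCols]))
dim(obsMatrix)
```

Non-observation columns (here mzmed and rtmed) are left out of the matrix that would go into the PCA.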

Details

Principal components analysis (PCA) is a commonly used method to identify clustering and potentially outlying samples in multivariate datasets. The Hotelling's T2 distribution is also commonly used to detect strongly outlying samples on the resulting scores plot. Weakly outlying samples (perhaps biological rather than analytical in origin) will appear close to the Hotelling's ellipse; a small proportional expansion of the ellipse can therefore be used to automatically remove strongly outlying samples whilst retaining weakly outlying ones.

The first two principal components (PCs) represent the greatest sources of systematic variation in a multivariate dataset, so consideration of the first two PCs should be sufficient to identify strong analytical outliers. Such outliers may arise from, for example, strongly contaminated biological samples, failed LC autosampler injections, experimental preparation errors and temporal errors in mass spectrometer performance.
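The idea of an expanded ellipse with iterative removal can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: stats::prcomp stands in for the pca function of pcaMethods, the ellipse semi-axes use one common form of the Hotelling's T2 limit for two components, and outTol is assumed to scale the semi-axes directly. The helper names hotellingRadii, flagOutliers and iterateOutliers are hypothetical.

```r
# Semi-axes of a Hotelling's T2 ellipse on two score vectors (assumed form):
# T2 limit = 2 (n^2 - 1) / (n (n - 2)) * F(alpha; 2, n - 2)
hotellingRadii <- function(scores, alpha = 0.95) {
  n <- nrow(scores)
  t2Lim <- qf(alpha, 2, n - 2) * 2 * (n^2 - 1) / (n * (n - 2))
  sqrt(apply(scores, 2, var) * t2Lim)
}

# Flag observations falling outside the ellipse expanded by outTol.
flagOutliers <- function(obsMatrix, outTol = 1.2, alpha = 0.95) {
  # prcomp is a stand-in here; scores on PC1 and PC2 only
  scores <- prcomp(obsMatrix, center = TRUE, scale. = TRUE)$x[, 1:2]
  radii <- hotellingRadii(scores, alpha) * outTol
  # A point lies outside the expanded ellipse when this exceeds 1
  d <- (scores[, 1] / radii[1])^2 + (scores[, 2] / radii[2])^2
  d > 1
}

# Repeat model calculation and removal, stopping early if nothing is flagged.
iterateOutliers <- function(obsMatrix, outTol = 1.2, maxIter = 2) {
  removed <- character(0)
  for (i in seq_len(maxIter)) {
    out <- flagOutliers(obsMatrix, outTol)
    if (!any(out)) break
    removed <- c(removed, rownames(obsMatrix)[out])
    obsMatrix <- obsMatrix[!out, , drop = FALSE]
  }
  list(kept = rownames(obsMatrix), removed = removed)
}

# Synthetic data: 30 observations (rows), 20 variables (columns)
set.seed(42)
x <- matrix(rnorm(30 * 20), nrow = 30,
            dimnames = list(paste0("s", 1:30), NULL))
x["s30", ] <- x["s30", ] + 15   # simulate a gross analytical outlier
res <- iterateOutliers(x)
res$removed
```

Raising outTol above 1 widens the ellipse, so samples lying just outside the unexpanded limit are retained while a grossly shifted sample such as s30 is still removed.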

Value

a list containing two elements: the original peak table with outliers removed, and a nested list with a sublist for each iteration of PCA calculation and outlier identification. Each PCA result sublist consists of 3 elements:

1. a pcaRes object of the PCA calculation.

2. the coordinates of the expanded Hotelling's ellipse.

3. a named logical vector flagging each outlier detected.
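A call and a first look at the returned list might resemble the sketch below. It assumes the MetMSLine package is installed (the call is skipped otherwise), that the two list elements come back in the order described above, and that they can be accessed by position, since element names are not documented here. The synthetic peak table and its column names are hypothetical.

```r
# Hypothetical synthetic peak table: 40 variables (rows), 12 samples (columns)
set.seed(7)
obsNames <- paste0("sample_", 1:12)
peakTable <- data.frame(feature = paste0("f", 1:40),
                        matrix(rlnorm(40 * 12, meanlog = 8), nrow = 40,
                               dimnames = list(NULL, obsNames)))

# Only run the call when the package is available
if (requireNamespace("MetMSLine", quietly = TRUE)) {
  res <- MetMSLine::pcaOutId(peakTable = peakTable, obsNames = obsNames,
                             outTol = 1.2, maxIter = 2)
  cleaned <- res[[1]]      # assumed: peak table with outliers removed
  iterations <- res[[2]]   # assumed: one sublist per PCA iteration
  str(iterations[[1]], max.level = 1)
}
```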

See Also

pca, pcaRes


WMBEdmands/MetMSLine documentation built on May 9, 2019, 10:03 p.m.