# Using Mahalanobis Distance and PCA for Quality Control

### Description

Compute the Mahalanobis distance of each sample from the center of an
`N`-dimensional principal component space.

### Usage

```
mahalanobisQC(spca, N)
```

### Arguments

| Argument | Description |
|---|---|
| `spca` | object of class `SamplePCA`, as returned by `SamplePCA()` (see Examples). |
| `N` | integer scalar specifying the number of components to use when assessing QC. |

### Details

In theory, under the null hypothesis that all samples arise from the
same multivariate normal distribution, the squared Mahalanobis distance
of a sample from the center of a `D`-dimensional principal component
space follows a chi-squared distribution with `D` degrees of freedom
(here `D` = `N`). This lets us compute a p-value for the distance of
each sample. The method can be used for quality control or outlier
identification.
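
The statement above can be written in symbols. The notation is introduced here for illustration and does not come from the package documentation: $s_i$ is the vector of scores of sample $i$ on the first $N$ principal components, $\bar{s}$ is the mean of those vectors, and $\Sigma$ is their covariance matrix. Because $\bar{s}$ and $\Sigma$ are estimated from the same samples, the chi-squared distribution should be read as an approximation.

$$
d_i^2 = (s_i - \bar{s})^{\top} \, \Sigma^{-1} \, (s_i - \bar{s}),
\qquad
d_i^2 \sim \chi^2_N \ \text{under the null hypothesis},
\qquad
p_i = \Pr\left(\chi^2_N \ge d_i^2\right).
$$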

### Value

Returns a data frame with two columns, whose rows correspond to the
columns of the original data set on which PCA was performed (that is,
to the samples). The first column is the chi-squared statistic, with
`N` degrees of freedom. The second column is the associated p-value.
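
To make the shape of the return value concrete, here is a minimal base R sketch that builds a comparable two-column data frame by hand. It is not the package's implementation: it uses `prcomp` and `stats::mahalanobis` on simulated data, and the package may estimate the variances differently. The column name `statistic` and the simulated matrix `x` are assumptions made for illustration; only `p.value` is confirmed by the Examples section.

```r
# Illustrative sketch only; not the code used by mahalanobisQC().
set.seed(1)

# Simulated data: 200 features (rows) by 30 samples (columns).
x <- matrix(rnorm(200 * 30), nrow = 200, ncol = 30)
colnames(x) <- paste0("S", 1:30)

N <- 2  # number of principal components to use

# prcomp() expects samples in rows, so transpose the data first.
pc <- prcomp(t(x), center = TRUE, scale. = FALSE)
scores <- pc$x[, 1:N, drop = FALSE]

# Squared Mahalanobis distance of each sample from the center of the scores.
d2 <- mahalanobis(scores, center = colMeans(scores), cov = cov(scores))

# One row per sample: the statistic and its upper-tail chi-squared p-value.
qc <- data.frame(statistic = d2,
                 p.value = pchisq(d2, df = N, lower.tail = FALSE),
                 row.names = rownames(scores))

head(qc)
```

The rows of `qc` are named after the columns of `x`, mirroring the description above. Small p-values mark samples that lie unusually far from the center.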

### Author(s)

Kevin R. Coombes krc@silicovore.com

### References

Coombes KR, et al.
*Quality control and peak finding for proteomics data collected from
nipple aspirate fluid by surface-enhanced laser desorption and ionization.*
Clin Chem 2003; 49:1615-23.

### Examples

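```
library(ClassDiscovery)  # provides SamplePCA() and mahalanobisQC()
library(oompaData)
data(lungData)
# PCA on the samples, after dropping rows with missing values
spca <- SamplePCA(na.omit(lung.dataset))
# QC statistics based on the first two principal components
mc <- mahalanobisQC(spca, 2)
# samples flagged as potential outliers at the 1% level
mc[mc$p.value < 0.01,]
```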