Mahalanobis: Mahalanobis


View source: R/distanceFunctions.R

Description

Calculates the Mahalanobis distance for each row (locus, SNP) in the data frame. The mean and covariance used in the calculation can optionally be estimated from a subset of rows (see Details).

Usage

Mahalanobis(dfv, column.nums = 1:ncol(dfv), subset = 1:nrow(dfv),
  S = NULL, M = NULL)

Arguments

dfv

a data frame containing observations in rows and statistics in columns.

column.nums

indexes the columns of the data frame that will be used to calculate Mahalanobis distance (all other columns are ignored).

subset

indexes the rows of the data frame that will be used to calculate the mean and covariance of the distribution (unless these are specified manually through S and M).

S

the covariance matrix used to normalise the data in the Mahalanobis calculation. Leave as NULL to use the ordinary covariance matrix calculated using cov(dfv[subset,column.nums],use="pairwise.complete.obs").

M

the point that Mahalanobis distance is measured from. Leave as NULL to measure distance from the mean of dfv[subset,column.nums].
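
As an illustration of how these arguments fit together, the following is a minimal base R sketch. It is not the MINOTAUR source: mahal_sketch is a hypothetical helper written only from the argument descriptions above, and it assumes the non-squared distance is wanted (stats::mahalanobis returns squared distances, so the square root is taken here).

```r
# Hypothetical helper, NOT part of MINOTAUR: mirrors the documented defaults.
mahal_sketch <- function(dfv, column.nums = 1:ncol(dfv), subset = 1:nrow(dfv),
                         S = NULL, M = NULL) {
  # keep only the columns used in the calculation
  x <- as.matrix(dfv[, column.nums, drop = FALSE])
  # rows that define the reference distribution
  ref <- x[subset, , drop = FALSE]
  # covariance defaults to the pairwise-complete covariance of the subset
  if (is.null(S)) S <- cov(ref, use = "pairwise.complete.obs")
  # centre defaults to the column means of the subset
  if (is.null(M)) M <- colMeans(ref, na.rm = TRUE)
  # stats::mahalanobis gives squared distances; the square root is an assumption
  sqrt(mahalanobis(x, center = M, cov = S))
}

set.seed(1)
df <- data.frame(x = rnorm(100), y = rnorm(100))

# distances for all 100 rows, reference distribution from the first 50 only
d <- mahal_sketch(df, subset = 1:50)

# with S = identity and M = origin the result is the Euclidean norm
e <- mahal_sketch(data.frame(a = c(0, 3), b = c(0, 4)), S = diag(2), M = c(0, 0))
```

In this sketch subset changes only which rows are used to estimate the mean and covariance; a distance is still returned for every row of the data frame.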

Details

Under default options the standard Mahalanobis calculation is used, based on the mean and covariance matrix of the data. Additional arguments can be used to specify the mean and covariance matrix manually, or to define a subset of points that are used to estimate them. The input data frame can contain some missing data, as long as a covariance matrix can still be computed using the function cov(dfv[subset,column.nums],use="pairwise.complete.obs").
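
The missing-data behaviour described above can be checked directly in base R. The sketch below uses made-up data (the data frame and the positions of the NA values are purely illustrative) and compares the pairwise-complete covariance with the default behaviour of cov. Pairwise deletion does not guarantee a positive definite matrix, so the last step tests whether the result could be inverted.

```r
set.seed(42)
df <- data.frame(x = rnorm(50), y = rnorm(50), z = rnorm(50))

# introduce a few missing values (illustrative positions)
df$x[c(3, 10)] <- NA
df$y[7] <- NA

# pairwise deletion: each entry uses the rows where both columns are observed
S <- cov(df, use = "pairwise.complete.obs")

# for comparison, the default (use = "everything") propagates NA
S_default <- cov(df)

# a usable covariance matrix has no NA entries and strictly positive eigenvalues
ok <- !anyNA(S) &&
  all(eigen(S, symmetric = TRUE, only.values = TRUE)$values > 0)
```

If ok were FALSE for a given data set, supplying a covariance matrix manually through S would be one way around the problem.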

Author(s)

Robert Verity r.verity@imperial.ac.uk

Examples

## Not run: 
# create a data frame of observations
df <- data.frame(x=rnorm(100),y=rnorm(100))

# calculate Mahalanobis distances
distances <- Mahalanobis(df)

# use this distance to look for outliers
Q95 <- quantile(distances, 0.95)
which(distances>Q95)

## End(Not run)

NESCent/MINOTAUR documentation built on May 7, 2019, 6:01 p.m.