View source: R/distanceFunctions.R
Calculates the Mahalanobis distance for each row (locus, SNP) in the data frame. Data are subset prior to calculating distances (see details).
dfv: a data frame containing observations in rows and statistics in columns.
column.nums: indexes the columns of the data frame that will be used to calculate Mahalanobis distance (all other columns are ignored).
subset: indexes the rows of the data frame that will be used to calculate the mean and covariance of the distribution (unless these are specified manually).
S: the covariance matrix used to normalise the data in the Mahalanobis calculation. Leave as NULL to use the ordinary covariance matrix calculated using cov(dfv[subset,column.nums],use="pairwise.complete.obs").
M: the point that Mahalanobis distance is measured from. Leave as NULL to measure distance from the mean of dfv[subset,column.nums].
Under default options the standard Mahalanobis calculation is used, based on the mean and covariance matrix of the data. Additional arguments can be used to specify the mean and covariance matrix manually, or to define a subset of points that are used in the calculation. The input data frame can contain some missing data, as long as a covariance matrix can still be computed using the function cov(dfv[subset,column.nums],use="pairwise.complete.obs").
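The calculation described above can be sketched in base R. This is an illustration under stated assumptions, not the package's own source: the example data, the choice of columns and the choice of subset rows are made up, and stats::mahalanobis() is used in place of the package function. Note that stats::mahalanobis() returns the squared distance; whether this package returns the squared or unsquared value is not stated here, so the sketch takes the square root as an assumption.

```r
# Illustrative sketch only; not the package's implementation.
set.seed(1)
dfv <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
dfv$x[5] <- NA            # a single missing value, to show how NAs behave

column.nums <- 1:2        # columns used in the distance (column z is ignored)
subset <- 1:50            # rows that define the reference distribution

X <- dfv[, column.nums]

# Covariance matrix estimated from the subset rows, tolerating missing values
S <- cov(X[subset, ], use = "pairwise.complete.obs")

# Reference point: column means of the subset rows, ignoring missing values
M <- colMeans(X[subset, ], na.rm = TRUE)

# stats::mahalanobis() gives the squared distance of every row from M
d <- sqrt(mahalanobis(as.matrix(X), center = M, cov = S))
```

Every row of dfv receives a distance, including rows outside subset; a row with a missing value in one of the selected columns gets NA in this sketch. Passing a different S or M corresponds to specifying the covariance matrix or the reference point manually.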
Robert Verity r.verity@imperial.ac.uk
## Not run:
# create a data frame of observations
df <- data.frame(x=rnorm(100),y=rnorm(100))
# calculate Mahalanobis distances
distances <- Mahalanobis(df)
# use this distance to look for outliers
Q95 <- quantile(distances, 0.95)
which(distances>Q95)
## End(Not run)