maha_dist: Computes Mahalanobis distance for each row of a data frame

View source: R/row-redux.R

Description

This function returns a vector with one element per row of the provided data frame. Each element corresponds to the average Mahalanobis distance of that row from the whole data set.

Usage

maha_dist(data, keep.NA = TRUE, robust = FALSE, stringsAsFactors = FALSE)

Arguments

data

A data frame

keep.NA

Ensure that every row with missing data remains NA in the output? TRUE by default.

robust

Attempt to compute Mahalanobis distance based on a robust covariance matrix? FALSE by default.

stringsAsFactors

Convert non-factor string columns into factors? FALSE by default.
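
To illustrate the keep.NA argument, here is a minimal sketch. It assumes the assertr package is installed; the data frame cars_na is a hypothetical copy of mtcars with one value blanked out, made up for this illustration.

```r
library(assertr)

# Hypothetical test data: mtcars with one missing value in the first row
cars_na <- mtcars
cars_na[1, "mpg"] <- NA

# Default (keep.NA = TRUE): the row with missing data stays NA in the output
d_keep <- maha_dist(cars_na)
is.na(d_keep[1])

# keep.NA = FALSE: that row is no longer forced to NA
d_drop <- maha_dist(cars_na, keep.NA = FALSE)
d_drop[1]
```

Leaving keep.NA at its default is the more conservative choice when the result feeds an assertion, since incomplete rows are flagged as NA rather than silently scored.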

Details

This is useful for finding anomalous observations, row-wise.

Categorical variables in the data frame are converted to numeric values only if they are stored as factors. A character column, for example, is used as a component of the distance calculation only if it is already a factor or is converted to one via the stringsAsFactors parameter.
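
The factor requirement can be sketched as follows, assuming assertr is installed. The data frame df and its column names are invented for this example and are not part of the package.

```r
library(assertr)

# Hypothetical toy data: two numeric columns and one character column
df <- data.frame(
  height = c(150, 160, 170, 180, 190, 165),
  weight = c(52, 61, 70, 85, 90, 58),
  group  = c("a", "b", "a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Option 1: let maha_dist convert the character column to a factor
d1 <- maha_dist(df, stringsAsFactors = TRUE)

# Option 2: convert the column yourself before calling maha_dist
df$group <- factor(df$group)
d2 <- maha_dist(df)
```

Either route should leave the group column eligible for the distance calculation; converting explicitly makes the factor levels, and hence the numeric codes, easier to control.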

Value

A vector of observation-wise Mahalanobis distances.
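
For intuition about what such a vector looks like, the sketch below uses base R's stats::mahalanobis, which returns the squared distance of each row from a given center. This is an illustration only, not assertr's implementation, and its values are not guaranteed to match the output of maha_dist.

```r
# Base R only; an illustration, not assertr's implementation
x <- as.matrix(mtcars)

# stats::mahalanobis() returns the squared distance of each row from `center`
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# One value per row; show the three rows farthest from the center
head(sort(d2, decreasing = TRUE), 3)
```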

See Also

insist_rows

Examples

library(assertr)             # provides maha_dist and insist_rows

maha_dist(mtcars)

maha_dist(iris, robust=TRUE)


library(magrittr)            # for piping operator
library(dplyr)               # for "everything()" function

# using every column from mtcars, compute mahalanobis distance
# for each observation, and ensure that each distance is within 10
# median absolute deviations from the median
mtcars %>%
  insist_rows(maha_dist, within_n_mads(10), everything())
  ## anything here will run

assertr documentation built on June 6, 2017, 5:06 p.m.