within_n_mads: Return a function to create robust z-score checking predicate

Description Usage Arguments Details Value See Also Examples

View source: R/predicates.R

Description

This function takes one argument, the number of median absolute deviations within which to accept a particular data point. This is generally more useful than its sister function within_n_sds because it is more robust to the presence of outliers. It is therefore better suited to identify potentially erroneous data points.

Usage

1

Arguments

n

The number of median absolute deviations from the median within which to accept a datum

...

Additional arguments to be passed to within_bounds

Details

As an example, if '2' is passed into this function, this will return a function that takes a vector and figures out the bounds of two median absolute deviations (MADs) from the median. That function will then return a within_bounds function that can then be applied to a single datum. If the datum is within two MADs of the median of the vector given to the function returned by this function, it will return TRUE. If not, FALSE.

This function isn't meant to be used on its own, although it can. Rather, this function is meant to be used with the insist function to search for potentially erroneous data points in a data set.

Value

A function that takes a vector and returns a within_bounds predicate based on the MAD of that vector.

See Also

within_n_sds

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
test.vector <- rnorm(100, mean=100, sd=20)

within.one.mad <- within_n_mads(1)
custom.bounds.checker <- within.one.mad(test.vector)
custom.bounds.checker(105)     # returns TRUE
custom.bounds.checker(40)      # returns FALSE

# same as
within_n_mads(1)(test.vector)(40)    # returns FALSE

within_n_mads(2)(test.vector)(as.numeric(NA))  # returns TRUE
# because, by default, within_bounds() will accept
# NA values. If we want to reject NAs, we have to
# provide extra arguments to this function
within_n_mads(2, allow.na=FALSE)(test.vector)(as.numeric(NA))  # returns FALSE

# or in a pipeline, like this was meant for

library(magrittr)

iris %>%
  insist(within_n_mads(5), Sepal.Length)

lorenzwalthert/assertr documentation built on May 20, 2019, 4:06 p.m.