The assertr package supplies a suite of functions designed to verify assumptions about data early in an analysis pipeline.
See the assertr vignette or the documentation for more information:
> vignette("assertr")
You may also want to read the documentation for the functions that assertr provides:
assert
verify
insist
assert_rows
insist_rows
not_na
in_set
num_row_NAs
maha_dist
within_bounds
within_n_sds
within_n_mads
library(assertr)
library(magrittr) # for the piping operator
library(dplyr)

# this confirms that
# - the dataset contains more than 10 observations
# - every value in the 'miles per gallon' (mpg) column is positive
# - the mpg column does not contain a datum that is more than
#   4 standard deviations from its mean
# - the am and vs columns (automatic/manual and V-shaped/straight engine,
#   respectively) contain 0s and 1s only
# - each row contains at most 2 NAs
# - each row's Mahalanobis distance is within 10 median absolute deviations
#   of all the distances (for outlier detection)
mtcars %>%
  verify(nrow(.) > 10) %>%
  verify(mpg > 0) %>%
  insist(within_n_sds(4), mpg) %>%
  assert(in_set(0,1), am, vs) %>%
  assert_rows(num_row_NAs, within_bounds(0,2), everything()) %>%
  insist_rows(maha_dist, within_n_mads(10), everything()) %>%
  group_by(cyl) %>%
  summarise(avg.mpg=mean(mpg))
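The predicates used in the pipeline above, such as in_set and within_bounds, return ordinary functions, so it can help to try them on a plain vector before wiring them into a chain. The following is a minimal sketch of that idea, and of what happens when a check fails under the default settings. The data frame bad_cars and the message string "assertion failed" are hypothetical, introduced only for illustration.

```r
library(assertr)

# Predicates return one TRUE/FALSE per element of their input.
set_check    <- in_set(0, 1)(c(0, 1, 2))              # TRUE TRUE FALSE
bounds_check <- within_bounds(0, 2)(c(-1, 0, 2, 3))   # FALSE TRUE TRUE FALSE
na_check     <- not_na(c(1, NA))                      # TRUE FALSE

# bad_cars is a hypothetical copy of mtcars with one invalid 'am' value.
bad_cars <- mtcars
bad_cars$am[1] <- 2

# With the default settings a failed assertion raises an error,
# which stops the pipeline; tryCatch is used here only to capture it.
result <- tryCatch(
  assert(bad_cars, in_set(0, 1), am),
  error = function(e) "assertion failed"
)
```

Because the failure surfaces as a regular R error, any later steps in a chain (such as group_by and summarise) never run on data that violated the assumption.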