recorder is a lightweight toolkit to validate new observations when computing
their corresponding predictions with a predictive model.
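With recorder, the validation process consists of two steps: first, record statistics and meta data of the training data with record(); second, run validation tests on new data with play().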
There can be many data-specific reasons why you might not be confident in the predictions of a predictive model on new data.
Some of them are obvious; others are more subtle, for instance when observations in new data are not within the "span" of the training data.
If one or more of the recorder validation tests fail on new data, you might not
be confident in the corresponding predictions.
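To make the idea of "span" concrete, here is a minimal base R sketch of a range check. It is only an illustration of the concept: `outside_range()` and the toy data frames are hypothetical and not part of recorder, and the tests recorder actually runs may differ.

```r
# Toy data (assumed for illustration only).
training <- data.frame(x = c(1.0, 2.5, 4.0), y = c(10, 20, 30))
new_data <- data.frame(x = c(2.0, 5.0), y = c(15, 25))

# Hypothetical helper: for every numeric column of the training data,
# flag the values in new data that fall outside the training [min, max].
outside_range <- function(training, new_data) {
  num_cols <- names(training)[vapply(training, is.numeric, logical(1))]
  flags <- lapply(setNames(num_cols, num_cols), function(col) {
    rng <- range(training[[col]], na.rm = TRUE)
    new_data[[col]] < rng[1] | new_data[[col]] > rng[2]
  })
  as.data.frame(flags)
}

flags <- outside_range(training, new_data)

# Rows of new data with at least one value outside the training range.
suspect_rows <- which(rowSums(flags) > 0)
```

In this toy example only the second row of `new_data` is flagged, because its `x` value of 5.0 exceeds the training maximum of 4.0.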
recorder can be installed from CRAN with install.packages('recorder').
If you want the development version, install it directly from GitHub:
devtools::install_github("smaakage85/recorder")
Get ready by loading the package.
library(recorder)
The famous iris dataset will be used as an example. The data set is divided
into training data, which can be used for model development, and new data for
predictions after modelling, which can be validated with recorder.
set.seed(1)
trn_idx <- sample(seq_len(nrow(iris)), 100)
data_training <- iris[trn_idx, ]
data_new <- iris[-trn_idx, ]
Record statistics and meta data of the training data with record().
tape <- record(data_training)
Run validation tests on new data with play().
playback <- play(tape, data_new)
Print the overall results of the validation tests.
playback
The test summary tells us that one observation (row #11) has a value of the variable "Petal.Length" outside the range recorded in the training data; hence we might not be confident in the prediction for this particular observation.
After running the validation tests, you can extract the results of any failed
tests for the rows/observations of new data with get_failed_tests().
failed_tests <- get_failed_tests(playback)
# print.
library(knitr)
kable(head(failed_tests, 15))
You might also find the functions get_failed_tests_string() and
get_clean_rows() to be useful.
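As a rough sketch of how these two functions might fit into a workflow, the example below keeps only the rows of new data that passed every test. It assumes that get_clean_rows() returns one logical value per row of new data and that get_failed_tests_string() returns one character string per row; please check the package documentation before relying on either assumption.

```r
library(recorder)

# Same split of iris as in the example above.
set.seed(1)
trn_idx <- sample(seq_len(nrow(iris)), 100)
data_training <- iris[trn_idx, ]
data_new <- iris[-trn_idx, ]

tape <- record(data_training)
playback <- play(tape, data_new)

# Assumption: a logical vector, TRUE for rows that passed all tests.
clean_rows <- get_clean_rows(playback)

# Keep only the observations whose predictions we would trust.
data_new_clean <- data_new[clean_rows, ]

# Assumption: a character vector describing the failed tests per row.
failed_strings <- get_failed_tests_string(playback)
```

Subsetting new data this way lets you compute predictions for the clean rows only and handle the remaining rows separately.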
That is basically it. If you want to know more about all of the exciting features
of recorder, take a look at the vignette.
Also, if you have any feedback on the package, please let me know.