View source: R/test_arguments.R
Test the performance of a prediction algorithm over a range of argument values. Multiple arguments can be tested simultaneously.
test_arguments(pred_fun, df_train, df_test, diagnostic_fun, arguments)
pred_fun
    The prediction algorithm to be tested. It should be a function with formal arguments
df_train
    training data
df_test
    testing data
diagnostic_fun
    the criteria with which the predictive performance will be assessed
arguments
    named list of arguments and their values to check
For each combination of the supplied argument levels, the value of pred_fun() is combined with df_test using cbind(), and the result is passed into diagnostic_fun() to compute the diagnostics. Since the number of columns in the returned value of pred_fun() is arbitrary, one can test both predictions and uncertainty quantification of the predictions (e.g., by including prediction standard errors or predictive interval bounds).
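The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the name sketch_test_arguments() is hypothetical, the grid of combinations is built with expand.grid(), and a plain data frame is returned rather than an object of class 'testargs'.

```r
## Hypothetical sketch of the described procedure (not the package's code).
sketch_test_arguments <- function(pred_fun, df_train, df_test,
                                  diagnostic_fun, arguments) {
  ## One row per combination of the supplied argument levels
  arg_grid <- expand.grid(arguments, stringsAsFactors = FALSE)

  diagnostics <- lapply(seq_len(nrow(arg_grid)), function(i) {
    current_args <- as.list(arg_grid[i, , drop = FALSE])
    ## Call pred_fun() with the data and the current argument values
    pred <- do.call(pred_fun,
                    c(list(df_train = df_train, df_test = df_test),
                      current_args))
    ## Combine the predictions with df_test, then compute the diagnostics
    diagnostic_fun(cbind(df_test, pred))
  })

  ## Argument combinations alongside their diagnostics
  cbind(arg_grid, do.call(rbind, diagnostics))
}

## Toy usage with made-up data: a quadratic signal fitted with lm()
set.seed(1)
toy <- data.frame(x = seq(-1, 1, length.out = 100))
toy$Z <- toy$x^2 + rnorm(100, sd = 0.1)
id <- sample(100, 50)

toy_pred <- function(df_train, df_test, degree) {
  M <- lm(Z ~ poly(x, degree), data = df_train)
  data.frame(fit_Z = predict(M, df_test))
}
toy_diag <- function(df) c(RMSE = sqrt(mean((df$Z - df$fit_Z)^2)))

res <- sketch_test_arguments(toy_pred, toy[id, ], toy[-id, ], toy_diag,
                             arguments = list(degree = 1:3))
print(res)
```

Because the diagnostics are computed on cbind(df_test, pred), diagnostic_fun() can refer to columns of df_test (here Z) and columns returned by pred_fun() (here fit_Z) by name.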
An object of class 'testargs' containing all information from the testing procedure.
plot_diagnostics, optimal_arguments
library("testarguments")
## Simulate training and testing data
RNGversion("3.6.0"); set.seed(1)
n <- 1000 # sample size
x <- seq(-1, 1, length.out = n) # covariates
mu <- exp(3 + 2 * x * (x - 1) * (x + 1) * (x - 2)) # polynomial function in x
Z <- rpois(n, mu) # simulate data
df <- data.frame(x = x, Z = Z, mu = mu)
train_id <- sample(1:n, n/2, replace = FALSE)
df_train <- df[train_id, ]
df_test <- df[-train_id, ]
## Algorithm that uses df_train to predict over df_test. We use glm(), and
## test the degree of the regression polynomial and the link function.
pred_fun <- function(df_train, df_test, degree, link) {
  M <- glm(Z ~ poly(x, degree), data = df_train,
           family = poisson(link = as.character(link)))
  ## Predict over df_test
  pred <- as.data.frame(predict(M, df_test, type = "link", se.fit = TRUE))
  ## Compute response level predictions and 90% prediction interval
  inv_link <- family(M)$linkinv
  fit_Y <- pred$fit
  se_Y <- pred$se.fit
  pred <- data.frame(fit_Z = inv_link(fit_Y),
                     upr_Z = inv_link(fit_Y + 1.645 * se_Y),
                     lwr_Z = inv_link(fit_Y - 1.645 * se_Y))
  return(pred)
}
## Define diagnostic function. Should return a named vector
diagnostic_fun <- function(df) {
  with(df, c(
    RMSE = sqrt(mean((Z - fit_Z)^2)),
    MAE = mean(abs(Z - fit_Z)),
    coverage = mean(lwr_Z < mu & mu < upr_Z)
  ))
}
## Compute the user-defined diagnostics over a range of argument levels
testargs_object <- test_arguments(
  pred_fun, df_train, df_test, diagnostic_fun,
  arguments = list(degree = 1:6, link = c("log", "sqrt"))
)
## Visualise the performance across all combinations of the supplied arguments
plot_diagnostics(testargs_object)
## Focus on a subset of the tested arguments
plot_diagnostics(testargs_object, focused_args = "degree")
## Compute the optimal arguments for each diagnostic
optimal_arguments(
  testargs_object,
  optimality_criterion = list(coverage = function(x) which.min(abs(x - 0.90)))
)
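The custom criterion supplied for coverage above is an ordinary R function that maps a vector of diagnostic values to the index of the preferred one. The small check below shows its behaviour in isolation; the name closest_to_nominal and the coverage values are made up for illustration and are not output of the package.

```r
## Same function as the coverage criterion above, given a hypothetical name
closest_to_nominal <- function(x) which.min(abs(x - 0.90))

## Made-up coverage values for four argument combinations
coverage <- c(0.70, 0.85, 0.91, 0.97)

## Index of the value closest to the nominal 90% level
best <- closest_to_nominal(coverage)
print(best)            # 3
print(coverage[best])  # 0.91
```

Selecting by distance from 0.90 rather than by the smallest or largest value suits coverage, since both under-coverage and over-coverage of a 90% interval are undesirable.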