test_arguments: Test (multiple) arguments of a prediction algorithm


View source: R/test_arguments.R

Description

Test the performance of a prediction algorithm over a range of argument values. Multiple arguments can be tested simultaneously.

Usage

test_arguments(pred_fun, df_train, df_test, diagnostic_fun, arguments)

Arguments

pred_fun

The prediction algorithm to be tested. It should be a function with formal arguments df_train and df_test (the data used to train the model and to assess out-of-sample predictive performance, respectively), plus any arguments that are to be tested. pred_fun should return a matrix-like object with named columns and the same number of rows as df_test.

df_train

Training data, passed to pred_fun as df_train.

df_test

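Testing data, passed to pred_fun as df_test and combined with the predictions before the diagnostics are computed.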

diagnostic_fun

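The criteria with which predictive performance is assessed: a function that takes the testing data combined with the predictions and returns the diagnostics (a named vector in the example under Examples).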

arguments

A named list of the arguments to test and the values of each to check.

Details

For each combination of the supplied argument levels, the value returned by pred_fun() is combined with df_test using cbind(), and the result is passed to diagnostic_fun() to compute the diagnostics. Because the number of columns returned by pred_fun() is arbitrary, one can assess both the predictions and their uncertainty quantification (e.g., by including prediction standard errors or predictive interval bounds).
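The procedure described above can be sketched in base R. This is only an illustration of the loop over argument combinations, not the package's actual implementation: sketch_test_arguments() and the toy_* objects are hypothetical names introduced here, and the return value is a plain data frame rather than a 'testargs' object.

```r
## Hypothetical helper (NOT part of testarguments): a base-R sketch of the
## loop over argument combinations described in Details.
sketch_test_arguments <- function(pred_fun, df_train, df_test,
                                  diagnostic_fun, arguments) {
  ## One row per combination of the supplied argument levels
  combos <- expand.grid(arguments, stringsAsFactors = FALSE)

  diagnostics <- lapply(seq_len(nrow(combos)), function(i) {
    ## Arguments for this combination, as a named list
    current_args <- as.list(combos[i, , drop = FALSE])
    pred <- do.call(pred_fun,
                    c(list(df_train = df_train, df_test = df_test),
                      current_args))
    ## Combine the test data with the predictions, then compute diagnostics
    diagnostic_fun(cbind(df_test, pred))
  })

  ## Argument combinations alongside their diagnostics
  cbind(combos, do.call(rbind, diagnostics))
}

## Toy usage (all objects below are illustrative assumptions)
set.seed(1)
toy   <- data.frame(x = seq(0, 1, length.out = 40))
toy$Z <- 1 + 2 * toy$x + rnorm(40, sd = 0.1)
toy_train <- toy[seq(1, 40, by = 2), ]
toy_test  <- toy[seq(2, 40, by = 2), ]

toy_pred_fun <- function(df_train, df_test, degree) {
  M <- lm(Z ~ poly(x, degree), data = df_train)
  data.frame(fit_Z = predict(M, df_test))
}

toy_diagnostic_fun <- function(df) {
  with(df, c(RMSE = sqrt(mean((Z - fit_Z)^2))))
}

res <- sketch_test_arguments(toy_pred_fun, toy_train, toy_test,
                             toy_diagnostic_fun,
                             arguments = list(degree = 1:3))
print(res)
```

The sketch returns one row per argument combination, with the tested argument values in the leading columns and one column per diagnostic.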

Value

An object of class 'testargs' containing all information from the testing procedure.

See Also

plot_diagnostics, optimal_arguments

Examples

library("testarguments")

## Simulate training and testing data
RNGversion("3.6.0"); set.seed(1)
n  <- 1000                                          # sample size
x  <- seq(-1, 1, length.out = n)                    # covariates
mu <- exp(3 + 2 * x * (x - 1) * (x + 1) * (x - 2))  # polynomial function in x
Z  <- rpois(n, mu)                                  # simulate data
df       <- data.frame(x = x, Z = Z, mu = mu)
train_id <- sample(1:n, n/2, replace = FALSE)
df_train <- df[train_id, ]
df_test  <- df[-train_id, ]

## Algorithm that uses df_train to predict over df_test. We use glm(), and
## test the degree of the regression polynomial and the link function.
pred_fun <- function(df_train, df_test, degree, link) {

  M <- glm(Z ~ poly(x, degree), data = df_train,
           family = poisson(link = as.character(link)))

  ## Predict over df_test
  pred <- as.data.frame(predict(M, df_test, type = "link", se.fit = TRUE))

  ## Compute response-level predictions and a 90% interval for the mean
  ## (built from the standard error of the fit on the link scale)
  inv_link <- family(M)$linkinv
  fit_Y <- pred$fit
  se_Y  <- pred$se.fit
  pred <- data.frame(fit_Z = inv_link(fit_Y),
                     upr_Z = inv_link(fit_Y + 1.645 * se_Y),
                     lwr_Z = inv_link(fit_Y - 1.645 * se_Y))

  return(pred)
}

## Define diagnostic function. Should return a named vector
diagnostic_fun <- function(df) {
  with(df, c(
    RMSE = sqrt(mean((Z - fit_Z)^2)),
    MAE = mean(abs(Z - fit_Z)),
    coverage = mean(lwr_Z < mu & mu < upr_Z)
  ))
}

## Compute the user-defined diagnostics over a range of argument levels
testargs_object <- test_arguments(
  pred_fun, df_train, df_test, diagnostic_fun,
  arguments = list(degree = 1:6, link = c("log", "sqrt"))
)

## Visualise the performance across all combinations of the supplied arguments
plot_diagnostics(testargs_object)

## Focus on a subset of the tested arguments
plot_diagnostics(testargs_object, focused_args = "degree")

## Compute the optimal arguments for each diagnostic
optimal_arguments(
  testargs_object,
  optimality_criterion = list(coverage = function(x) which.min(abs(x - 0.90)))
)
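The optimality criterion supplied for coverage in the call above is an ordinary function that maps a vector of diagnostic values to the index of the preferred one. A minimal sketch of what it does in isolation follows; the name closest_to_nominal and the coverage values are made up for illustration and are not output from the example.

```r
## Same function body as the coverage criterion in the example:
## pick the index whose value is closest to the nominal level 0.90.
closest_to_nominal <- function(x) which.min(abs(x - 0.90))

## Made-up coverage values for four hypothetical argument combinations
coverage_values <- c(0.80, 0.88, 0.93, 0.99)

best <- closest_to_nominal(coverage_values)
print(best)                    # index of the preferred combination
print(coverage_values[best])   # its coverage
```

Here the second value (0.88) is selected because it lies closest to 0.90. Minimising the distance to the nominal level, rather than the coverage itself, reflects that coverage is best when it matches the nominal level, not when it is as small or as large as possible.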

testarguments documentation built on May 28, 2021, 9:06 a.m.