In signaux-faibles/MLsegmentr: Model evaluation on data segments

segmentr

A set of tools for model evaluation, used in the project Signaux Faibles https://beta.gouv.fr/startups/signaux-faibles.html

Steps to evaluate one or several models, against one or several targets and on one or several segments:

1. Format the data to be evaluated: one or several id columns, one prediction column for each model, one target column for each target, and one factor column to define the segments.
2. Define an evaluation function in an eval_function object.
3. Create and parametrize an Assesser object.
4. Assess.
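As a rough illustration of step 1, the expected layout can be sketched with a small hand-made data frame. The column names below (id, model_a, model_b, outcome, region) are hypothetical and chosen only for this example.

```r
# Hypothetical toy data in the layout described in step 1.
toy <- data.frame(
  id      = 1:6,                              # id column
  model_a = c(0.1, 0.8, 0.4, 0.9, 0.2, 0.7),  # predictions of a first model
  model_b = c(0.2, 0.6, 0.5, 0.7, 0.1, 0.9),  # predictions of a second model
  outcome = c(0, 1, 0, 1, 0, 1),              # target
  region  = factor(c("north", "north", "south", "south", "west", "west"))  # segments
)

# One row per observation, one column per role
str(toy)
```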

For instance, let us try two simple models to predict the temperature in the airquality dataset.

airq <- datasets::airquality
lin1 <- lm(Temp ~ Wind, data = airq)
lin2 <- lm(Temp ~ Wind + Solar.R, data = airq)
pred1 <- predict(lin1, airq)
pred2 <- predict(lin2, airq)

airq$model1 <- pred1
airq$model2 <- pred2
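Note that Solar.R contains missing values in airquality, so the second model has no prediction for those rows. A quick check, using only base R, might look like this:

```r
airq <- datasets::airquality
lin2 <- lm(Temp ~ Wind + Solar.R, data = airq)
pred2 <- predict(lin2, airq)

# predict() returns one value per row of airq;
# rows with a missing Solar.R get an NA prediction
length(pred2) == nrow(airq)
sum(is.na(airq$Solar.R))
sum(is.na(pred2))

# Number of missing predictions per month
table(airq$Month[is.na(pred2)])
```

Any evaluation function applied to model2 therefore has to handle missing values, for instance with na.rm = TRUE.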

The evaluation function must be wrapped in an eval_function object. The function itself takes a single parameter: a data frame with columns "prediction" and "target".

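For instance, let us compute the mean of the squared errors and their standard deviation. The columns are named rss and rss_sigma here, although strictly speaking the residual sum of squares would be sum(se) rather than mean(se):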

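rss <- function(frame){
  # Squared error of each observation
  se <- (frame$prediction - frame$target) ^ 2
  # na.rm = TRUE: model2 has no prediction where Solar.R is missing
  return(
    data.frame(
      rss = mean(se, na.rm = TRUE),
      rss_sigma = sd(se, na.rm = TRUE))
  )
}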
library(MLsegmentr)
evaluation <- eval_function(eval_fun = rss)
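Because the evaluation function is a plain R function, it can be tried on its own before being handed to the package. The sketch below applies a function with the same contract to each month by hand, using only base R. It illustrates the idea of a per-segment evaluation and is not MLsegmentr's implementation; the names mse_by_hand and per_month are hypothetical.

```r
airq <- datasets::airquality
airq$model1 <- predict(lm(Temp ~ Wind, data = airq), airq)

# Same contract as the evaluation function: a data frame with
# "prediction" and "target" columns goes in, a one-row data.frame comes out
mse_by_hand <- function(frame) {
  se <- (frame$prediction - frame$target)^2
  data.frame(mse = mean(se, na.rm = TRUE), mse_sigma = sd(se, na.rm = TRUE))
}

# Rename to the expected column names, then evaluate each month separately
frame <- data.frame(prediction = airq$model1, target = airq$Temp)
per_month <- do.call(rbind, lapply(split(frame, airq$Month), mse_by_hand))
per_month
```

Comparing such a hand-made table with the package output can be a useful sanity check.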

There are two options for functions that return several values. If a list is returned, the output of the model assessment is gathered into two columns, "evaluation_name" and "evaluation". If a data.frame is returned, the output of the model assessment is spread (tidy), with the same columns as the data.frame.
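The two return styles can be sketched as follows. Both functions are hypothetical examples that compute the same two quantities; only the return type differs. The code shows what each function returns by itself, not how the package reshapes it afterwards.

```r
frame <- data.frame(prediction = c(1, 2, 4), target = c(1, 3, 2))

# Variant 1: named list, one element per evaluation
eval_as_list <- function(frame) {
  err <- frame$prediction - frame$target
  list(mae = mean(abs(err)), bias = mean(err))
}

# Variant 2: one-row data.frame, one column per evaluation
eval_as_df <- function(frame) {
  err <- frame$prediction - frame$target
  data.frame(mae = mean(abs(err)), bias = mean(err))
}

str(eval_as_list(frame))
str(eval_as_df(frame))
```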

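Optionally, one can define a plot function. It takes the assessment output as a data frame, in which the following columns are available: "prediction_type", "target_type" and "segment", plus either "evaluation" and "evaluation_name" (when the evaluation function returns a list) or the columns of the returned data.frame (here, rss and rss_sigma):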

library(ggplot2)
plot_rss <- function(frame){
  p <- ggplot(frame, aes(
    x = prediction_type,
    y = rss,
    ymin = rss - 1.96 * rss_sigma,
    ymax = rss + 1.96 * rss_sigma,
    color = prediction_type
  )) +
    geom_point() +
    geom_errorbar() +
    facet_grid(cols = vars(segment))
  plot(p)
  return(p)
}

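# Wrap the evaluation function together with its plot function
evaluation <- eval_function(eval_fun = rss, plot_fun = plot_rss)

# Create the Assesser on the data, then declare which columns hold
# the predictions, the segments and the targets
my_assesser <- Assesser$new(test_data = airq)
my_assesser$evaluation_funs <- evaluation
my_assesser$set_predictions(prediction_names = c("model1", "model2"))
my_assesser$set_segments(segment_names = "Month")
my_assesser$set_targets(target_names = "Temp")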

And at last: