Comparing modeling engines"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4
)

Overview

lineagefreq provides multiple modeling engines through the unified fit_model() interface. This vignette shows how to compare them using the built-in backtesting framework.

Three engines are available in v0.1.0:

Setup

library(lineagefreq)

Engine 1: MLR

The default engine fits a multinomial logistic regression with one growth rate parameter per non-reference lineage.

data(sarscov2_us_2022)
x <- lfq_data(sarscov2_us_2022,
              lineage = variant, date = date,
              count = count, total = total)

fit_mlr <- fit_model(x, engine = "mlr")
growth_advantage(fit_mlr, type = "growth_rate")

Engine 2: Piantham

The Piantham engine wraps MLR and translates growth rates to relative effective reproduction numbers using a specified mean generation time.

fit_pian <- fit_model(x, engine = "piantham",
                      generation_time = 5)
growth_advantage(fit_pian, type = "relative_Rt",
                 generation_time = 5)

Comparing fit statistics

glance() returns a one-row summary for each model. Since Piantham is a wrapper around MLR, the log-likelihood and AIC are identical.

dplyr::bind_rows(
  glance.lfq_fit(fit_mlr),
  glance.lfq_fit(fit_pian)
)

Backtesting

The backtest() function implements rolling-origin evaluation. At each origin date, the model is fit on past data and forecasts are compared to held-out future observations.

bt <- backtest(x,
  engines = c("mlr", "piantham"),
  horizons = c(7, 14, 21),
  min_train = 56,
  generation_time = 5
)
bt

Scoring

score_forecasts() computes standardized accuracy metrics.

sc <- score_forecasts(bt,
  metrics = c("mae", "coverage"))
sc

Model ranking

compare_models() summarizes scores per engine, sorted by MAE.

compare_models(sc, by = c("engine", "horizon"))

Visualization

plot_backtest(sc)

When to use which engine

| Scenario | Recommended engine | |----------|--------------------| | Single location, quick estimate | mlr | | Need relative Rt interpretation | piantham | | Multiple locations, sparse data | hier_mlr | | Time-varying fitness (v0.2) | garw |

Hierarchical MLR

When data spans multiple locations with unequal sequencing depth, hier_mlr shrinks location-specific estimates toward the global mean. This stabilizes estimates for low-data locations.

A demonstration requires multi-location data, which the built-in single-location dataset does not provide. See ?fit_model for an example with simulated multi-location data.



Try the lineagefreq package in your browser

Any scripts or data that you put into this service are public.

lineagefreq documentation built on April 3, 2026, 9:09 a.m.