knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4 )
lineagefreq provides multiple modeling engines through the unified
fit_model() interface. This vignette shows how to compare them
using the built-in backtesting framework.
Three engines are available in v0.1.0: mlr, piantham, and hier_mlr.
library(lineagefreq)
The default engine fits a multinomial logistic regression with one growth rate parameter per non-reference lineage.
data(sarscov2_us_2022)
x <- lfq_data(sarscov2_us_2022,
              lineage = variant, date = date,
              count = count, total = total)
fit_mlr <- fit_model(x, engine = "mlr")
growth_advantage(fit_mlr, type = "growth_rate")
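To make the model structure concrete, the following base R sketch computes expected lineage frequencies from a multinomial logistic model with one intercept and one growth rate per non-reference lineage. The parameter values and the names `alpha`, `delta`, and `freq` are hypothetical and chosen for illustration only; this is not the package's internal code.

```r
# Hypothetical parameters for three lineages. The first is the reference,
# so its intercept and growth rate are fixed at zero.
alpha <- c(ref = 0, A = -3.0, B = -5.0)  # log-odds vs. reference at t = 0
delta <- c(ref = 0, A = 0.05, B = 0.12)  # growth rate per day vs. reference

t <- 0:60  # days since the first observation

# Linear predictor for each lineage (rows = days, columns = lineages)
eta <- outer(t, delta) +
  matrix(alpha, nrow = length(t), ncol = length(alpha), byrow = TRUE)

# Softmax across lineages gives the expected frequencies on each day
freq <- exp(eta) / rowSums(exp(eta))
colnames(freq) <- names(alpha)

round(freq[c(1, 31, 61), ], 3)
```

Each row of `freq` sums to one, and a lineage with a larger growth rate gains share over time regardless of its starting frequency.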
The Piantham engine wraps MLR and translates growth rates to relative effective reproduction numbers using a specified mean generation time.
fit_pian <- fit_model(x, engine = "piantham", generation_time = 5)
growth_advantage(fit_pian, type = "relative_Rt", generation_time = 5)
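The translation from growth rate to relative reproduction number can be sketched in base R. This assumes the common fixed-generation-time approximation, in which the relative Rt equals the exponential of the per-day growth rate multiplied by the mean generation time. Whether the package uses exactly this formula, and whether it reports growth rates per day, are assumptions here; the growth rates below are hypothetical.

```r
# Hypothetical per-day growth rates relative to the reference lineage
growth_rate <- c(A = 0.05, B = 0.12)
generation_time <- 5  # mean generation time in days, matching the call above

# Assumed approximation: relative Rt = exp(growth rate * generation time)
relative_rt <- exp(growth_rate * generation_time)
relative_rt
```

Under this approximation a longer assumed generation time amplifies the same growth rate into a larger relative Rt, so the choice of `generation_time` affects interpretation but not model fit.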
glance() returns a one-row summary for each model. Since
Piantham is a wrapper around MLR, the log-likelihood and AIC
are identical.
dplyr::bind_rows(
  glance.lfq_fit(fit_mlr),
  glance.lfq_fit(fit_pian)
)
The backtest() function implements rolling-origin evaluation.
At each origin date, the model is fit on past data and forecasts
are compared to held-out future observations.
bt <- backtest(x,
  engines = c("mlr", "piantham"),
  horizons = c(7, 14, 21),
  min_train = 56,
  generation_time = 5
)
bt
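The rolling-origin scheme itself can be illustrated without the package. The sketch below builds train/target splits from a hypothetical vector of weekly dates, reusing the `min_train` and `horizons` values from the call above. Requiring every horizon to fall within the observed range is one possible convention and may differ from what backtest() does internally.

```r
# Hypothetical weekly observation dates (not the packaged dataset)
dates <- seq(as.Date("2022-01-01"), by = "week", length.out = 20)

min_train <- 56            # days of data required before the first origin
horizons  <- c(7, 14, 21)  # forecast horizons in days

# Origins are dates with at least min_train days of history
# and enough future data to cover the longest horizon
origins <- dates[dates - min(dates) >= min_train &
                 dates + max(horizons) <= max(dates)]

splits <- lapply(origins, function(origin) {
  list(
    origin  = origin,
    train   = dates[dates <= origin],  # data available to the model
    targets = origin + horizons        # dates whose observations are held out
  )
})

length(splits)
```

Each element of `splits` corresponds to one refit: the model only ever sees `train`, and its forecasts are scored against observations at `targets`.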
score_forecasts() computes standardized accuracy metrics.
sc <- score_forecasts(bt, metrics = c("mae", "coverage"))
sc
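As a reference for what the two metrics measure, here is a minimal base R sketch using the standard definitions of mean absolute error and interval coverage. The numbers are hypothetical, and the package may aggregate or weight scores differently.

```r
# Hypothetical forecasts and held-out observations (lineage frequencies)
observed  <- c(0.10, 0.25, 0.40, 0.62)
predicted <- c(0.12, 0.22, 0.45, 0.60)
lower     <- c(0.08, 0.18, 0.41, 0.50)  # lower bound of prediction interval
upper     <- c(0.16, 0.27, 0.52, 0.70)  # upper bound of prediction interval

# Mean absolute error of the point forecasts
mae <- mean(abs(predicted - observed))

# Coverage: share of observations that fall inside their interval
coverage <- mean(observed >= lower & observed <= upper)

c(mae = mae, coverage = coverage)
```

Lower MAE indicates more accurate point forecasts, while coverage should be close to the nominal level of the prediction interval rather than as high as possible.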
compare_models() summarizes scores for each grouping given in by (here, engine and horizon), sorted by MAE.
compare_models(sc, by = c("engine", "horizon"))
plot_backtest(sc)
| Scenario | Recommended engine |
|----------|--------------------|
| Single location, quick estimate | mlr |
| Need relative Rt interpretation | piantham |
| Multiple locations, sparse data | hier_mlr |
| Time-varying fitness (v0.2) | garw |
When data spans multiple locations with unequal sequencing depth,
hier_mlr shrinks location-specific estimates toward the global
mean. This stabilizes estimates for low-data locations.
A demonstration requires multi-location data, which the built-in
single-location dataset does not provide. See
?fit_model for an example with simulated multi-location data.
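The shrinkage idea can still be illustrated with a toy calculation. The sketch below applies textbook precision-weighted partial pooling to hypothetical per-location growth-rate estimates. It is not the hier_mlr implementation, and the between-location variance `tau2` is simply assumed rather than estimated.

```r
# Hypothetical per-location growth-rate estimates and standard errors
est <- c(loc1 = 0.10, loc2 = 0.14, loc3 = 0.30)
se  <- c(loc1 = 0.01, loc2 = 0.02, loc3 = 0.15)  # loc3 has sparse data

tau2 <- 0.03^2  # assumed between-location variance

# Precision-weighted global mean
w <- 1 / (se^2 + tau2)
global_mean <- sum(w * est) / sum(w)

# Shrinkage factor: near 0 for precise estimates, near 1 for noisy ones
b <- se^2 / (se^2 + tau2)
shrunk <- (1 - b) * est + b * global_mean

round(cbind(est, shrunk, shrinkage = b), 3)
```

The noisy estimate for `loc3` is pulled strongly toward the global mean, while the well-sequenced `loc1` barely moves.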