Getting started with lineagefreq

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4
)

Overview

lineagefreq models pathogen lineage frequency dynamics from genomic surveillance count data. Given a table of lineage-resolved sequence counts over time, the package estimates relative growth advantages, generates short-term frequency forecasts, and provides tools for evaluating model accuracy.

This vignette demonstrates the core workflow using simulated SARS-CoV-2 surveillance data.

Preparing data

The entry point is lfq_data(), which validates and standardizes a count table. The minimum input is a data frame with columns for date, lineage name, and sequence count.

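library(lineagefreq)

# Load the simulated example dataset and preview the first rows
data(sarscov2_us_2022)
head(sarscov2_us_2022)

# Map the dataset's columns onto the roles lfq_data() expects
x <- lfq_data(sarscov2_us_2022,
              lineage = variant,
              date    = date,
              count   = count,
              total   = total)
x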

The function computes frequencies, flags low-count time points, and returns a validated lfq_data object.
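
To make the frequency step concrete, the following base R sketch computes per-date frequencies from a long-format count table and flags sparse dates. The toy data frame, its column names, and the `min_total` threshold of 10 are illustrative assumptions, not a description of lfq_data() internals.

```r
# Toy long-format count table (hypothetical values, for illustration only)
counts <- data.frame(
  date    = as.Date(rep(c("2022-01-03", "2022-01-10"), each = 2)),
  variant = rep(c("A", "B"), times = 2),
  count   = c(90, 10, 3, 5)
)

# Total sequences per date, aligned to the rows of `counts`
counts$total <- ave(counts$count, counts$date, FUN = sum)

# Frequency of each lineage within its date
counts$freq <- counts$count / counts$total

# Flag dates with too few sequences; the threshold is an assumed value
min_total <- 10
counts$low_count <- counts$total < min_total

counts
```

Frequencies estimated from very few sequences are noisy, which is why a flag of this kind is useful before fitting.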

Fitting a model

fit_model() provides a unified interface. The default engine is multinomial logistic regression (MLR).

fit <- fit_model(x, engine = "mlr")
fit

The print output shows each lineage's estimated growth rate relative to the pivot (reference) lineage, which is auto-selected as the most prevalent lineage early in the time series.
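
For orientation, the sketch below shows the textbook form of an MLR frequency model: each lineage has an intercept and a slope on time, the pivot's coefficients are fixed at zero, and a softmax turns the linear predictors into frequencies. The coefficient values and the helper `mlr_freq()` are hypothetical, and the package's exact parameterization may differ.

```r
# Hypothetical intercepts and per-day growth rates; the pivot is fixed at 0
alpha <- c(pivot = 0, B = -3.0)
beta  <- c(pivot = 0, B = 0.10)

# Softmax of the linear predictors gives lineage frequencies on day t
mlr_freq <- function(t, alpha, beta) {
  eta <- alpha + beta * t
  exp(eta) / sum(exp(eta))
}

p0  <- mlr_freq(0, alpha, beta)
p30 <- mlr_freq(30, alpha, beta)

# The log-ratio to the pivot changes linearly in time with slope beta,
# so this difference equals 30 * beta["B"]
slope_check <- log(p30["B"] / p30["pivot"]) - log(p0["B"] / p0["pivot"])
slope_check
```

Under this form, a lineage's growth rate is meaningful only relative to the pivot, which is why the choice of reference lineage affects how the printed coefficients read.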

Extracting growth advantages

growth_advantage() converts growth rates into interpretable metrics. Four output types are available.

ga <- growth_advantage(fit,
                       type = "relative_Rt",
                       generation_time = 5)
ga

A relative Rt above 1 indicates a lineage growing faster than the reference. The confidence intervals are derived from the Fisher information matrix.
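
One common approximation converts a growth-rate difference into a relative Rt by exponentiating it over one generation time, with a Wald interval built from the standard error. The sketch below illustrates that arithmetic only. The values of `beta` and `se` are made up, the per-day time unit is assumed, and whether growth_advantage() uses exactly this formula is also an assumption.

```r
beta <- 0.05            # hypothetical growth-rate difference per day
se   <- 0.01            # hypothetical standard error of beta
generation_time <- 5    # assumed to be in days

z <- qnorm(0.975)       # two-sided 95% normal quantile

# Point estimate and Wald interval, transformed to the relative Rt scale
relative_rt <- exp(beta * generation_time)
ci <- exp((beta + c(-1, 1) * z * se) * generation_time)

round(c(estimate = relative_rt, lower = ci[1], upper = ci[2]), 3)
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the point estimate and cannot go below zero.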

Visualizing the fit

autoplot() supports four plot types for fitted models.

autoplot(fit, type = "frequency")
autoplot(fit, type = "advantage", generation_time = 5)

Forecasting

forecast() projects frequencies forward with uncertainty quantified by parametric simulation.

fc <- forecast(fit, horizon = 28)
autoplot(fc)
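
Parametric simulation, in general terms, draws coefficient vectors from the approximate sampling distribution of the estimates, projects each draw forward, and summarizes the spread. The sketch below does this for a two-lineage model in base R. The estimates, covariance matrix, and number of draws are hypothetical, and forecast() may differ in its details.

```r
set.seed(1)

# Hypothetical estimates for one non-pivot lineage: log-ratio to the pivot
# at the last observed date (alpha) and its daily slope (beta)
est <- c(alpha = -1.0, beta = 0.08)
vcov_est <- matrix(c(0.04, -0.001,
                     -0.001, 0.0001), nrow = 2, byrow = TRUE)

# Draw from N(est, vcov_est) using the Cholesky factor
n_draws <- 2000
z <- matrix(rnorm(n_draws * 2), ncol = 2)
draws <- sweep(z %*% chol(vcov_est), 2, est, "+")

# Project the lineage frequency over the forecast horizon for every draw
horizon <- 1:28
eta  <- draws[, 1] + outer(draws[, 2], horizon)  # n_draws x 28 linear predictors
freq <- plogis(eta)                              # softmax with a single pivot

# Pointwise 2.5%, 50%, and 97.5% quantiles on each forecast day
bands <- apply(freq, 2, quantile, probs = c(0.025, 0.5, 0.975))
bands[, c(1, 28)]
```

The interval widens with the horizon because uncertainty in the slope is multiplied by the number of days projected.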

Detecting emerging lineages

summarize_emerging() tests each lineage for statistically significant frequency increases.

summarize_emerging(x)
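
As a rough illustration of what testing for a frequency increase can look like, the sketch below fits a binomial logistic regression of one lineage's share on time and applies a one-sided Wald test to the slope. This is one generic choice of test, not necessarily the one summarize_emerging() uses, and the counts are invented.

```r
# Hypothetical weekly counts for one lineage and the weekly totals
week  <- 0:5
count <- c(2, 4, 9, 15, 30, 48)
total <- c(200, 210, 190, 205, 198, 202)

# Logistic regression of lineage share on time
fit_glm <- glm(cbind(count, total - count) ~ week, family = binomial)

# One-sided Wald test of H0: slope <= 0 against H1: slope > 0
coefs <- summary(fit_glm)$coefficients
z_stat <- coefs["week", "z value"]
p_one_sided <- pnorm(z_stat, lower.tail = FALSE)

c(slope = unname(coefs["week", "Estimate"]), p_value = p_one_sided)
```

When many lineages are screened this way, the resulting p-values would normally need a multiple-testing adjustment, for example with p.adjust().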

Next steps




lineagefreq documentation built on April 3, 2026, 9:09 a.m.