lfq_data: Create a lineage frequency data object

View source: R/lfq-data.R

Create a lineage frequency data object

Description

Validates, structures, and annotates lineage count data for downstream modeling and analysis. This is the entry point for all lineagefreq workflows.

Usage

lfq_data(
  data,
  lineage,
  date,
  count,
  total = NULL,
  location = NULL,
  min_total = 10L
)

Arguments

data

A data frame containing at minimum columns for lineage identity, date, and count.

lineage

<tidy-select> Column containing lineage/variant identifiers (character or factor).

date

<tidy-select> Column containing collection dates (Date class or parseable character).

count

<tidy-select> Column containing sequence counts (non-negative integers).

total

<tidy-select> Optional column giving the total number of sequences per date (and per location, if location is supplied). If NULL, the total is computed as the sum of count within each such group.

location

<tidy-select> Optional column for geographic stratification.

min_total

Minimum total count per time point. Time points whose total falls below this value are flagged as unreliable (.reliable is FALSE). Default 10L.
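To make the interaction between total, location, and min_total concrete, here is a minimal base-R sketch. It is not the package's implementation: the data frame, its column names, and the assumption that the grouping unit is date plus location are all illustrative.

```r
# Hypothetical input: two dates, two locations, two lineages (illustrative only).
d <- data.frame(
  date     = as.Date(rep(c("2024-01-01", "2024-01-08"), each = 4)),
  location = rep(rep(c("North", "South"), each = 2), 2),
  lineage  = rep(c("JN.1", "KP.3"), 4),
  n        = c(30L, 20L, 4L, 2L, 45L, 15L, 6L, 6L),
  stringsAsFactors = FALSE
)

# Assumed grouping: one group per date-location combination.
grp <- interaction(format(d$date), d$location, drop = TRUE)

# What a NULL `total` is described as falling back to: the sum of counts per group.
d$total <- ave(d$n, grp, FUN = sum)

# Flag groups whose total falls below the threshold (documented default: 10).
min_total <- 10L
d$reliable <- d$total >= min_total

d
```

In this toy data only the South rows for 2024-01-01 (total of 6) fall below the threshold and would be flagged as unreliable.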

Details

Performs the following validation and processing:

  1. Checks that all required columns exist and have correct types.

  2. Coerces character dates to Date via ISO 8601 parsing.

  3. Ensures counts are non-negative integers.

  4. Replaces NA counts with 0 (with warning).

  5. Aggregates duplicate lineage-date rows by summing (with warning).

  6. Computes per-time-point totals and frequencies.

  7. Flags time points below min_total as unreliable.

  8. Sorts by date ascending, then lineage alphabetically.
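The steps above can be approximated in base R as follows. This is a sketch under stated assumptions, not the code in R/lfq-data.R: the input data and the variable names raw, agg, and out are made up for illustration, and the warnings the package is documented to emit are omitted.

```r
# Hypothetical messy input: character dates, an NA count, a duplicated row.
raw <- data.frame(
  date    = c("2024-01-08", "2024-01-01", "2024-01-01", "2024-01-01", "2024-01-08"),
  lineage = c("JN.1", "KP.3", "JN.1", "JN.1", "KP.3"),
  n       = c(40, NA, 3, 2, 10),
  stringsAsFactors = FALSE
)

# Step 2: coerce ISO 8601 character dates to Date.
raw$date <- as.Date(raw$date, format = "%Y-%m-%d")

# Step 4: replace NA counts with 0 (the package is documented to warn here).
raw$n[is.na(raw$n)] <- 0

# Step 5: sum duplicate lineage-date rows.
agg <- aggregate(n ~ date + lineage, data = raw, FUN = sum)

# Step 6: per-date totals and observed frequencies.
agg$total <- ave(agg$n, format(agg$date), FUN = sum)
agg$freq  <- agg$n / agg$total

# Step 7: flag time points whose total is below the threshold.
min_total <- 10
agg$reliable <- agg$total >= min_total

# Step 8: sort by date ascending, then lineage alphabetically.
out <- agg[order(agg$date, agg$lineage), ]
rownames(out) <- NULL
out
```

Here the two JN.1 rows for 2024-01-01 collapse into a single row with a count of 5, and that date's total of 5 falls below the threshold, so both of its rows are flagged as unreliable.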

Value

An lfq_data object (a tibble subclass) with standardized columns:

.lineage

Lineage identifier (character).

.date

Collection date (Date).

.count

Sequence count (integer).

.total

Total sequences at this time point (integer).

.freq

Observed frequency (numeric).

.reliable

Logical; TRUE if .total >= min_total.

.location

Location, if provided (character).

All original columns are preserved.

Examples

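# Eight weekly time points with three lineages each; every week sums to 100 sequences.
d <- data.frame(
  date = rep(seq(as.Date("2024-01-01"), by = "week",
                 length.out = 8), each = 3),
  lineage = rep(c("JN.1", "KP.3", "Other"), 8),
  n = c(5, 2, 93, 12, 5, 83, 28, 11, 61, 50, 20, 30,
        68, 18, 14, 80, 12, 8, 88, 8, 4, 92, 5, 3)
)
# Map the input columns onto the lineage, date, and count arguments.
x <- lfq_data(d, lineage = lineage, date = date, count = n)
x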


lineagefreq documentation built on April 3, 2026, 9:09 a.m.