ATQ Guide

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE, message = FALSE
)

ATQ: Using Absenteeism Data to Detect Onset of Epidemics

Introduction

The ATQ package provides tools for public health institutions to detect epidemics using school absenteeism data. It offers functions to simulate regional populations of households, elementary schools, and epidemics, and to calculate alarm metrics from these simulations.

This package builds on the work of Ward et al. and Vanderkruk et al. It introduces the Alert Time Quality (ATQ) metrics such as the Average ATQ (AATQ) and First ATQ (FATQ), to evaluate the timeliness and accuracy of epidemic alerts. This vignette demonstrates the package's use through a simulation study based on Vanderkruk et al., modeling yearly influenza epidemics and their alarm metrics in the Wellington-Dufferin-Guelph public health unit, Canada.

To use the package, install and load it with:

tryCatch({
  devtools::install_github("vjoshy/ATQ_Surveillance_Package")
}, error = function(e) {
  message("Unable to install package from GitHub. Using local version if available.")
})

library(ATQ)

The following sections will guide you through population simulation, epidemic modeling, and alarm metric calculation using the ATQ package.

Methods

ATQ provides a simulation model that consists of three sequential parts: 1) a population of individuals, 2) annual influenza epidemics, 3) school absenteeism and laboratory confirmed influenza case data. The final part of this section will include alarm metrics evaluation.

Population simulation

To simulate the population of the Wellington-Dufferin-Guelph (WDG) region in Ontario, Canada, the package offers the following functions:

If population proportions are not provided to subpop_children and subpop_noChildren, the functions will prompt the user for input.

set.seed(123)

# Simulate 16 catchments of 80x80 squares and the number of schools they contain
catchment <- catchment_sim(16, 80, dist_func = stats::rgamma, shape = 4.313, rate = 3.027)

# Simulate population size of elementary schools 
elementary<- elementary_pop(catchment, dist_func = stats::rgamma, shape = 5.274, rate = 0.014)

# Simulate households with children
house_children <- subpop_children(elementary, n = 5,
                                  prop_parent_couple = 0.7668901,
                                  prop_children_couple = c(0.3634045, 0.4329440, 0.2036515),
                                  prop_children_lone = c(0.5857832, 0.3071523, 0.1070645),
                                  prop_elem_age = 0.4976825)

# Simulate households without children using pre-specified proportions
house_noChild <- subpop_noChildren(house_children, elementary,
                                   prop_house_size = c(0.23246269, 0.34281716, 0.16091418, 0.16427239, 0.09953358),
                                   prop_house_Children = 0.4277052)

# Combine household simulations and generate individual-level data
households <- simulate_households(house_children, house_noChild)

Epidemic and Laboratory Confirmed Cases simulation

The package simulates epidemics using a stochastic Susceptible-Infected-Removed (SIR) framework. This approach differs from Vanderkruk et al., who used a spatial and network-based individual-level model.

Simulation Process

The summary and plot methods can be used to visualize and summarize the simulated epidemics:

# isolate individuals data
individuals <- households$individual_sim

# simulate epidemics for 10 years, each with a period of 300 days and 32 individuals infected initially
# infection period of 4 days 
epidemic <- ssir(nrow(individuals), T = 300, alpha = 0.298, avg_start = 45, 
                 min_start = 20, inf_period = 4, inf_init = 32, report = 0.02, lag = 7, rep = 10)

# Summarize and plot the epidemic simulation results
summary(epidemic)

plot(epidemic)

Absenteeism simulation

The compile_epi function in this code compiles and processes epidemic data, simulating school absenteeism using epidemic and individual data. It creates a data set for actual cases, absenteeism and laboratory confirmed cases, this data set will also include a "True Alarm Window", reference dates for each epidemic year and seasonal lag values.

Absenteeism data is simulated as follows:

The data is aggregated across all schools.

absent_data <- compile_epi(epidemic, individuals)

dplyr::glimpse(absent_data)

Alarm Metrics Evaluation

The eval_metrics function assesses the performance of epidemic alarm systems across various lags and thresholds using school absenteeism data. It evaluates the following key metrics:

A logistic regression model with lagged absenteeism and fixed seasonal terms given by:

[ \text{logit}(\theta_{tj}) = \beta_0 + \sum_{k=0}^l \beta_{k+1}x_{(t-k)j} + \beta_{l+2}\sin\Bigg(\frac{2\pi t^}{T^}\Bigg) + \beta_{l+3}\cos\Bigg(\frac{2\pi t^}{T^}\Bigg) + \gamma_j ]

where (\theta_{tj}) represents the probability of having at least one reported case on day ( t ) for school year ( j ). The predictor variables ( x_{(t-k)j} ) are the mean daily absenteeism percentages with lag times ranging from 0 to ( l ). To account for seasonal variations in influenza, trigonometric functions with sine and cosine terms are included, where ( t^ ) denotes the calendar day of the year when ( x_{tj} ) is observed and ( T^ = 365.25 ) days represents the length of a year. Additionally, (\gamma_j) captures random effects specific to each school year ( j ).

eval_metrics also identifies the best model parameters (lag & threshold) for each metric. The output is a list with three main components:

In the example provided, alarms are calculated for school years 2 to 10, considering lags up to 15 days and threshold values ranging from 0.1 to 0.6 in 0.05 increments. Year weights are assigned proportionally to the school year number.

# Evaluate alarm metrics for epidemic detection
# lag of 15
alarms <- eval_metrics(absent_data, maxlag = 15, thres = seq(0.1,0.6,by = 0.05))

summary(alarms$results)

# Plot various alarm metrics values
plot(alarms$metrics, "FAR")    # False Alert Rate
plot(alarms$metrics, "ADD")    # Accumulated Days Delayed
plot(alarms$metrics, "FATQ")   # First Alert Time Quality
plot(alarms$metrics, "AATQ")   # Average ATQ
plot(alarms$metrics, "WFATQ")  # Weighted FATQ
plot(alarms$metrics, "WAATQ")  # Weighted Average ATQ

# visualization of epidemics with alarms raised.
alarm_plots <- plot(alarms$plot_data)
for(i in seq_along(alarm_plots)) { 
  print(alarm_plots[[i]]) 
}

References

Vanderkruk, K.R., Deeth, L.E., Feng, Z. et al. ATQ: alert time quality, an evaluation metric for assessing timely epidemic detection models within a school absenteeism-based surveillance system. BMC Public Health 23, 850 (2023). https://doi.org/10.1186/s12889-023-15747-z



Try the ATQ package in your browser

Any scripts or data that you put into this service are public.

ATQ documentation built on Sept. 11, 2024, 8:12 p.m.