

InfluenzaNet Incidence Computation

Author: Clément Turbelin (clement.turbelin@sorbonne-unversite.fr)

Version: 1.1, 4 Nov 2019

Data & Variables used to compute Incidence

External Data

See DataAnalysisGuidelines.md

Data Loading

Incidence is computed by season, using only that season's data (from the last September of the season onward). Variables are given human-readable, meaningful names; the corresponding database column name is indicated beside each variable.

Intake survey variables:

For each participant, the last available intake survey response for the season is used (only one age and one location are considered for a participant during a season).
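As an illustration of this selection, here is a minimal Python sketch. The row layout (`person_id`, `date`, `age`) and the explicit season bounds are assumptions made for the example, not the actual database columns.

```python
from datetime import date

def last_intake_per_participant(intakes, season_start, season_end):
    """Keep, for each participant, the latest intake dated within the season."""
    last = {}
    for row in intakes:
        if not (season_start <= row["date"] <= season_end):
            continue  # intake outside the season: ignored in this sketch
        current = last.get(row["person_id"])
        if current is None or row["date"] > current["date"]:
            last[row["person_id"]] = row
    return last

# Hypothetical data: participant 1 answered twice, participant 2 only in a past season
intakes = [
    {"person_id": 1, "date": date(2018, 11, 2), "age": 34},
    {"person_id": 1, "date": date(2019, 1, 15), "age": 35},
    {"person_id": 2, "date": date(2017, 12, 1), "age": 60},
]
selected = last_intake_per_participant(intakes, date(2018, 9, 1), date(2019, 8, 31))
```

In this sketch a participant with no intake during the season is simply absent from the result, which corresponds to the exclusion option.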

Differences across platforms:

In some countries, data from all seasons are stored in the same table and some participants have no intake for the current season. In that case the intake can be loaded from previous seasons, but a limit should be applied (to make sure very old data are not used); alternatively, participants without an intake during the season should be excluded (TBD).

This is especially the case for IT, ES and UK (2015), where the season is counted from October to April.

Weekly survey variables:

Variable Recoding

Some variables are recoded to make later coding error-proof, and inconsistent values (such as dates in the future) are recoded to missing.

weekly:

intake:

Remarks:

Inconsistency of date.birth is not checked here but should be (negative ages and implausibly old participants can occur).

Inconsistency of age was checked for syndromic classification but not for age-group stratification (it should be).

Consistency of sympt.start and fever.start with the survey date (they should precede it) is not checked here, but such reports are excluded during computation if these dates fall outside the computing period.
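The recoding of future dates to missing, described in this section, might look like the following sketch. It assumes that "in the future" means later than the survey date and that a missing value is represented by `None`; the function name is hypothetical.

```python
from datetime import date

def recode_future_date(value, survey_date):
    """Return None (missing) when the date is missing or later than the survey date."""
    if value is None or value > survey_date:
        return None
    return value

survey_date = date(2019, 2, 1)
kept = recode_future_date(date(2019, 1, 28), survey_date)    # plausible onset date
dropped = recode_future_date(date(2019, 3, 1), survey_date)  # onset after the survey
```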

Syndromic classification

Each survey is evaluated against each syndrome definition. One boolean column (0/1) per syndrome type (each corresponding to one definition) is assigned to each survey response.

For each participant, the age from the last available intake survey is used [TBD].

Any symptom declared as sudden

is_sudden = (sympt.sudden not missing and sympt.sudden is "Yes") OR (fever.sudden not missing and fever.sudden is "Yes")

Pain is only taken into account if age is over 5 (and under 120, to exclude inconsistent ages)

has_pain = if age > 5 and age < 120 use pain value else consider it True
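The two rules above can be sketched in Python as follows. The argument names and the use of `None` for missing values are assumptions; the answer label "Yes" is taken from the pseudocode.

```python
def is_sudden(sympt_sudden, fever_sudden):
    """True if either onset was declared as sudden; None (missing) never matches."""
    return sympt_sudden == "Yes" or fever_sudden == "Yes"

def has_pain(age, pain):
    """Use the reported pain value only for plausible ages over 5; otherwise assume True."""
    if age is not None and 5 < age < 120:
        return bool(pain)
    return True
```

Note that in this sketch a missing age falls into the "else" branch, so pain is considered present.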

Q6d coding (highest.temp)

Fever over 39

fever_level_39 = highest.temp not missing and is 4 or 5 (6 is recoded to missing)

Fever over 38

fever_level_38 = highest.temp not missing and is 3, 4 or 5 (6 is recoded to missing)

General set of symptoms for ARI

general_ari = any_of[fever, chills, asthenia, headache] OR has_pain

Syndrome definitions:

Remarks: differences between the written definition and the latest implementation:
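A minimal sketch of the fever-level flags and the general ARI symptom set, assuming that highest.temp holds the numeric answer code (with `None` for missing) and that symptoms are passed as a dictionary of booleans. Both representations are assumptions made for the example.

```python
def fever_level_39(highest_temp):
    """Fever over 39: answer code 4 or 5 (missing or any other code gives False)."""
    return highest_temp in (4, 5)

def fever_level_38(highest_temp):
    """Fever over 38: answer code 3, 4 or 5."""
    return highest_temp in (3, 4, 5)

def general_ari(symptoms, has_pain):
    """True if any general symptom is reported, or if pain is considered present."""
    general = ("fever", "chills", "asthenia", "headache")
    return any(symptoms.get(name, False) for name in general) or has_pain
```

Code 6 is expected to be recoded to missing upstream; here it yields False in both flags either way.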

Data preparation

Available Parameters:

1. Exclude same rule [ param=exclude.same.delay ]

delay = number of days since the previous survey (computed on truncated dates)

if same.episode is Yes and the previous survey has delay < exclude.same.delay, cancel the syndrome report (the syndrome is considered not incident)
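As a sketch of this rule, the function below decides whether a syndrome report stays incident. It assumes that the delay is measured between the current and the previous survey of the same participant, and that "truncated date" means dropping the time of day; the function and argument names are hypothetical.

```python
from datetime import datetime

def is_incident(survey_time, previous_time, same_episode, exclude_same_delay):
    """Return False when the report continues a recent episode and must be cancelled."""
    if same_episode != "Yes" or previous_time is None:
        return True
    # Delay in whole days, computed on dates truncated to the day
    delay = (survey_time.date() - previous_time.date()).days
    return delay >= exclude_same_delay

# Same episode reported 5 days after the previous survey, with a 15-day threshold
cancelled = not is_incident(datetime(2019, 1, 10, 8, 0), datetime(2019, 1, 5, 23, 0), "Yes", 15)
```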

2. Compute onset column

onset = first available date from fever.start, sympt.start, survey date

incidence week = ISO 8601 year-week of the onset. Caution: use the year of the week, not the year of the date (strftime %G%V). We use a numeric encoding, year * 100 + week number (the date of the Monday of the week is an alternative representation).
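The onset and year-week computation can be illustrated with Python's standard library. The function names are hypothetical, and `None` stands for a missing date; `date.fromisocalendar` requires Python 3.8 or later.

```python
from datetime import date

def onset_date(fever_start, sympt_start, survey_date):
    """First available date among fever.start, sympt.start and the survey date."""
    for candidate in (fever_start, sympt_start, survey_date):
        if candidate is not None:
            return candidate
    return None

def iso_yearweek(d):
    """ISO 8601 year-week encoded as year * 100 + week, using the ISO week-based year."""
    iso_year, iso_week, _ = d.isocalendar()
    return iso_year * 100 + iso_week

def monday_of_yearweek(yw):
    """Date of the Monday of an encoded ISO year-week."""
    return date.fromisocalendar(yw // 100, yw % 100, 1)

# 31 Dec 2018 belongs to ISO week 1 of 2019, not to week 1 of 2018
yw = iso_yearweek(onset_date(None, date(2018, 12, 31), date(2019, 1, 3)))
```

The example date shows why the calendar year must not be used: combining the calendar year 2018 with week number 1 would place the onset a full year too early.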

3. Aggregate syndrome counts by week and participant, counting each syndrome kind only once per person-week

(so for each syndrome = if syndrome > 0 then 1 else 0)
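A possible sketch of this aggregation, assuming each survey is a dictionary holding the encoded week (`yw`), a participant id (`person_id`) and one 0/1 column per syndrome. These key names and the syndrome names in the example are hypothetical.

```python
def aggregate_person_week(surveys, syndromes):
    """Collapse surveys to one row per (week, participant) with 0/1 flags per syndrome."""
    aggregated = {}
    for survey in surveys:
        key = (survey["yw"], survey["person_id"])
        flags = aggregated.setdefault(key, {name: 0 for name in syndromes})
        for name in syndromes:
            if survey.get(name, 0) > 0:
                flags[name] = 1  # several reports in the same week still count once
    return aggregated

surveys = [
    {"yw": 201903, "person_id": 1, "ili": 1, "ari": 1},
    {"yw": 201903, "person_id": 1, "ili": 1, "ari": 0},
    {"yw": 201903, "person_id": 2, "ili": 0, "ari": 0},
]
person_week = aggregate_person_week(surveys, ["ili", "ari"])
```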

4. Compute season-wide data by participant (needed to apply selection rules for each week)

Incidence computation for a given week yw (year-week) in a given season.

Active participants selection

For a given week, computation has two steps:

Incidence computation for the week yw and a given syndrome definition

Computation can be done using a set of strata (for example age-group, regions)

  1. With the selected active participants: compute the active count for the week by stratum
  2. Take the weekly surveys whose onset week equals the currently computed week yw and whose participant is active for the week
  3. Count the number of participants with the syndrome by stratum
  4. At this step, each stratum should have a syndrome count and an active participant count for the week yw
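The steps above can be sketched as follows. The inputs are assumptions for the example: `active` maps each active participant of the week to a stratum, and `person_week` maps `(yw, person_id)` to 0/1 syndrome flags (one row per person-week).

```python
from collections import Counter

def counts_by_stratum(active, person_week, yw, syndrome):
    """Return {stratum: (syndrome_count, active_count)} for week yw."""
    active_count = Counter(active.values())
    syndrome_count = Counter()
    for (week, person_id), flags in person_week.items():
        # Only person-weeks of the computed week, for participants active that week
        if week == yw and person_id in active and flags.get(syndrome, 0) > 0:
            syndrome_count[active[person_id]] += 1
    return {stratum: (syndrome_count[stratum], n) for stratum, n in active_count.items()}

active = {1: "0-20", 2: "0-20", 3: "21+"}
person_week = {
    (201903, 1): {"ili": 1},
    (201903, 3): {"ili": 0},
    (201903, 4): {"ili": 1},  # participant 4 is not active: ignored
    (201904, 2): {"ili": 1},  # other week: ignored
}
strata = counts_by_stratum(active, person_week, 201903, "ili")
```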

Crude incidence rate

Crude incidence = total count of participants with the syndrome for the week yw / total active participants (summed over all strata) for the week yw. The confidence interval bounds are the exact Poisson 95% CI, computed on the total active participants of the week yw.
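For illustration, the crude rate and an exact Poisson interval can be computed with the standard library alone. This sketch assumes that the "exact" interval is the usual one obtained by inverting the Poisson distribution function (solved here by bisection) and that the bounds on the count are divided by the number of active participants; it is not necessarily how the package computes it.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), with terms computed in log space."""
    if k < 0:
        return 0.0
    if mu <= 0:
        return 1.0
    terms = (math.exp(i * math.log(mu) - mu - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, sum(terms))

def _bisect_decreasing(f, lo, hi, iterations=200):
    """Root of a decreasing function f with f(lo) > 0 > f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_exact_ci(count, alpha=0.05):
    """Exact two-sided confidence interval for a Poisson mean given an observed count."""
    hi = 2 * count + 20  # generous upper bracket for the search
    upper = _bisect_decreasing(lambda mu: poisson_cdf(count, mu) - alpha / 2, 0.0, hi)
    if count == 0:
        return 0.0, upper
    lower = _bisect_decreasing(lambda mu: poisson_cdf(count - 1, mu) - (1 - alpha / 2), 0.0, hi)
    return lower, upper

def crude_incidence(count, active):
    """Crude rate with its confidence bounds, all expressed per active participant."""
    lower, upper = poisson_exact_ci(count)
    return count / active, lower / active, upper / active

rate, rate_low, rate_up = crude_incidence(10, 1000)
```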

Adjusted incidence rate by strata

In each stratum:

Adjusted incidence = sum(rate) over all strata

The confidence interval is computed using the DKES estimate for the adjusted rate (Fay & Feuer, 1997, Statistics in Medicine 16, p. 791-801)
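A minimal sketch of the point estimate only, under an explicit assumption: the rate of each stratum is taken to be its crude rate multiplied by a population weight (direct standardization), with weights summing to 1. The weights, the stratum names and the function name are hypothetical, and the confidence interval is not implemented here.

```python
def adjusted_incidence(strata, weights):
    """Sum of weighted stratum rates.

    strata:  {stratum: (syndrome_count, active_count)}
    weights: {stratum: population share} (assumed to sum to 1)
    """
    total = 0.0
    for name, (count, active) in strata.items():
        if active > 0:  # a stratum without active participants contributes nothing
            total += weights[name] * count / active
    return total

strata = {"0-20": (2, 100), "21+": (3, 300)}
weights = {"0-20": 0.25, "21+": 0.75}
adjusted = adjusted_incidence(strata, weights)  # 0.25 * 0.02 + 0.75 * 0.01
```

How to handle a stratum with no active participant is a design choice; skipping it, as done here, silently lowers the estimate, so a real implementation may prefer to report it.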


