In nathoze/Rsero: Estimate the annual force of infection using serological data

This vignette presents the different models used in the Rsero package, their implementation, and the different uses of these models to fit the data. Models of the force of infection are used to explain the change of the profile of seropositivity with age, using as the main model assumption that an increase in seroprevalence is the result of the cumulated exposure to a pathogen during a lifetime. We included in this package various models of the force of infection, but also other factors affecting the serology such as imperfect sensitivity and specificity, waning-immunity (seroreversion) and differential risks of exposure.

The data

We first load the package

library(Rsero)

The data must be in the following format

data <- simulate_SeroData(number_samples = 5)
print(data)

Models of the force of infection

The basic models

We considered five models that define the force of infection at a given year:

Annual hazard of infection. This model corresponds to independent, yearly values of the force of infection.
Outbreak model. The force of infection is defined as a sum of gaussians.
Piecewise constant model.
Constant force of infection.
Combination of constant and outbreak

In the constant model a single parameter of the force of infection is evaluated.

The $piecewise$ models consider successive constant phases. They take as an input the number $K$ of different constant phases, where $K \geq 1$. The value $K=1$ corresponds to the constant model.

In the case of the outbreak model, the force of infection at year $i$ is given as a sum of $K$ gaussians: $$\lambda_i = \sum_{j=1}^K\bar{\alpha}_{j} e^{-{(i-T_j)^2}}.$$ The $T_j$ are the time points where the force of infection is maximal and is equal to $\bar{\alpha}_j$. The parameter $\bar{\alpha}_j$ is normalized so that $\alpha_j=\bar{\alpha}_j \sum_i e^{-(i-T_j)^2}$. In the case where the $T_j$ are far apart, the gaussians are well separated and around $T_j$ the force of infection is given by a single gaussian.

Using the models

The different models are defined using the following commands

model <- FOImodel(type = "constant") # a constant model
model <- FOImodel(type = "independent") # independent annual force of infection
model <- FOImodel(type = "piecewise", K = 2) # Two constant phase 
model <- FOImodel(type = "outbreak", K = 1) # one outbreak
model <- FOImodel(type = "constantoutbreak", K = 2) # two outbreaks and a constant annual force of infection
print(model)

Printing recapitulates the main information concerning the model. For instance, in the "constantoutbreak" model with K=2, a total of 7 parameters are estimated (1 for the constant annual force of infection and 3 for each outbreak). The prior distribution for the alpha and T of each outbreak and for the annual FOI are also printed. Without additional input the defaults normal priors are used.

Secondary parameters affecting the seroprevalence

Seroreversion

Seroreversion is modeled as a rate of switching from a seropositive state to a seronegative state. In the Rsero package we implemented a combination of seroreversion with any epidemiological model.

We note $X_a(t)$ the probability that an individual born a years before the survey was seropositive $t$ years before the survey. Starting at birth, where $X_a(a) = 0$ (i.e. we assume all individuals are born seronegative), and considering $\lambda_y$ the FOI on year $y$ and the seroreversion rate $\rho$, the probability that an individual is seropositive $y-1$ years before the survey is given by the difference equation involving the probability that they are seropositive $y$ years before the survey and the probability the force of infection between year $y$ and $y-1$, considered as a constant : $$ X_a(y-1) =X_a(y) e^{-(\lambda_y+\rho)} +\frac{\lambda_y}{\lambda_y+\rho}(1-e^{-(\lambda_y+\rho)}). $$ The probability that the individual is seropositive in the survey is $P=X_a(0)$. A parameter seroreversion = 0 or seroreversion = 1 can be defined in the FOIModel. By default seroreversion = 0.

Age-dependent force of infection

We implemented models that both consider age- and time-dependency in the force of infection. In this model, the force of infection is defined by an additional parameter $C$, so that $ \lambda(y, age) = \lambda_y*e^{C(age-1)}$, so that the reference FOI is given during the first year of exposure (one-year-old children in the sample being exposed from 0 to 1 y.o), and $C>0$ indicates an increasing risk with age ($C<0$ indicates a decreasing risk with age). With the same rational as for the general seroreversion equation, the probability that an individual is seropositive in the survey is given by $P=X_a(0)$, where $X_a(a)=0$ and

$$ X_a(y-1) = 1 -(1-X_a(y)) e^{-\lambda_y e^{C(a-y)}}. $$ or equivalently, $$ P = 1-exp(-\sum_{i=0}^{a-1}\lambda_{a-i}e^{Ci}) $$ This parameter can be defined in addition to the seroreversion parameter, in which case the difference equation accounts for both terms.

A parameter age_dependent_foi = 0 or age_dependent_foi = 1 can be defined in the FOIModel. By default age_dependent_foi = 0.

Imperfect sensitivity and specificity

The models can account for uncertainty in the seroprevalence estimates resulting from imperfect sensitivity and specificity of the assays. We note $se$ and $sp$ the sensitivity and specificity of the assay, respectively. In the scenario of perfect sensitivity and specificity ($se=sp=1$) the seroprevalence is the same as the proportion of the population infected by the pathogen. In the general case, if we denote this proportion $Inf$, the probability for an individual to be observed as seropositive is $$ P=1-sp+Inf(se+sp-1) $$

In the implementation of the package, the user can give specific values to the sensitivity and specificity for any transmission models and with or without seroreversion. These parameters are fixed and must be specified by the user along with the model of the force of infection. For example, if the sensitivity is 98% and the specificity 92%, the user provides these number along with the FOI model.

model <- FOImodel(type = "outbreak", K = 1, se = 0.98, sp = 0.92) # one outbreak
print(model)

By default se = 1 and sp = 1, and the parameters se and sp must verify that $se+sp-1$.

Likelihood from the annual force of infection

The probability that an individual of age $a_j$ is infected depends on the history of the force of infection during his/her lifetime. It is therefore a function of the cumulative force of infection $\lambda_S + \lambda_{S-1}+\dots+ \lambda_{S-a_j}$, where $S$ is the sampling year. More specifically, the contribution to the likelihood of a seropositive individual is $$P_j = 1 - \exp\bigg(-\sum_{i = 0}^{a_j} \lambda_{S-i}\bigg).$$

The contribution to the likelihood of a seronegative individual is $$P_j = \exp\bigg(-\sum_{i = 0}^{a_j} \lambda_{S-i} \bigg).$$

Setting categories in the serological data

Models can be defined to include different risks of infection by category. Categories are defined in the data. Let's look at an example dataset provided with the package of a serological survey of CHIKV in Vietnam. This survey was described in Quan et al., 2018 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5823466/) and includes 546 individuals sampled in 4 different areas of the country.
First, we load the dataset:

sero.data =  read.csv('../data-raw/CHIKVVietnam.csv', header = TRUE, sep=';')

chikv_vietnam = Rsero::SeroData(age_at_sampling = round(as.numeric(sero.data$age)), 
                                Y = as.numeric(sero.data$Y)>9,
                                sampling_year = as.numeric(sero.data$sampling_year), 
                                category = as.character(sero.data$Loc))

We could have loaded the dataset directly too

data("chikv_vietnam")

We defined categories corresponding to the different regions. This can be seen in the column category of the dataset chikv_vietnam that associates to each individual one of the four region:

print(chikv_vietnam$unique.categories) # the different categories
print(head(chikv_vietnam$category))

We define a model of the annual force of infection that accounts for different risks of infection in each region.

modelI2Full  =FOImodel(type='piecewise',
                       K=2,
                       cat_lambda = TRUE)

We defined an piecewise model, to describe that we expect a drop of the force of infection at a certain point in time. The parameter cat_lambda = TRUE is used to set different relative values of the force of infection while keeping equal the timing of the drop in infection probability. Therefore, the fit is done simultaneously on the four regions.

FOIfit.vietnam = fit(model = modelI2full, data=chikv_vietnam)

An arbitrary number of categories can be defined. In the following example we define three different categories (sex, location, occupation).

foi =0.01
ns  = 600
sex.list  = c('male', 'female')
location.list = c('Region 1', 'Region 2', "Region 3")
occupation.list = c("HCW","OTHER")
sex <- sample(sex.list,ns, replace= TRUE)
location <- sample(location.list,ns, replace= TRUE)
occupation <- sample(occupation.list, ns, replace = TRUE)

factor = 1*(sex=='males') + 0.5*(sex=='females')
factor =factor*( 1*(location=="Region 1") + 0.5*(location=="Region 2") + 3*(location=="Region 3"))
factor =factor*( 1*(occupation=="HCW") + 1.1*(occupation!="HCW") )

ages = round(runif(n=length(sex))*70)+1

data_tmp = SeroData(age_at_sampling =  ages ,
                    Y =  runif(n=ns) < 1-exp(-foi*factor*ages), 
                    sampling_year = 2023,
                    category = cbind(sex, location, occupation))

Model comparison

Different measures of model adequacy were implemented: The Pareto smoothed importance sampling leave-one-out (PSIS-lLOO) The deviance information criterion (DIC), the Akaike information criterion (AIC), the Watanabe-Akaike information criterion (WAIC). for a detailed explanation of all three measures, see Gelman A, Hwang J, Vehtari A. Understanding predictive information criteria for Bayesian models. Statistics and computing. 2014 Nov 1;24(6):997-1016 and Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and computing, 27, 1413-1432.

compute_information_criteria(FOIfit.vietnam)

modelOutbreak =FOImodel(type='outbreak',
                          K=1)
FOIfit.vietnam2 = fit(model = modelOutbreak, data=chikv_vietnam)
compute_information_criteria(FOIfit.vietnam2)

The DIC of the piecewise constant model is lower than the DIC of the outbreak model. This indicates that the former is a more adequate description of the data. In general, DIC differences of more than 5 points with the lowest DIC indicate weak support for a model.

Priors

The user can choose between a normal and an exponential distribution. Normal priors were set by default for all parameters. We give below the default values of the parameters that are mean and the standard deviation of the normal distribution, respectively.

Default prior distributions

For the constant model

priorC1 = 0.01
priorC2 = 1

For the piecewise constant model

priorC1 = 0.01
priorC2 = 1
priorT1 = 20
priorT2 = 10

For the outbreak model

prioralpha1 = 0.2
prioralpha2 = 0.2
priorT1 = 20
priorT2 = 10

For the constant+outbreak model

prioralpha1 = 0.2
prioralpha2 = 0.2
priorT1 = 20
priorT2 = 10
priorC1 = 0.01
priorC2 = 1

For the independent model

priorY1 = 0.01
priorY2 = 1

In the case of seroreversion

priorRho1 = 1
priorRho2 = 1

Specifying the parameters of the prior distributions: example

In the case of the "outbreak" and the "constantoutbreak" models each outbreak can be defined by a specific set of prior distributions. Here we set K=2. For the first outbreak, the prior for T is normal(10,30), the prior for alpha is normal(0,5). For the second outbreak, the prior for T is normal(40,70), the prior for alpha is normal(0,1).

model <- FOImodel(type = "outbreak", K = 2, 
                  prioralpha1 = c(0,0), prioralpha2= c(5, 1), 
                  priorT1 = c(10,40), priorT2= c(30, 70))
print(model)

## choosing other prior distributions

model_2 <- FOImodel(type = "outbreak", K = 1,
                    seroreversion = 1,
                    prior_distribution_alpha = "normal",
                    prior_distribution_rho = "exponential",
                    priorRho1=1)
print(model_2)

nathoze/Rsero documentation built on June 13, 2025, 10:10 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

nathoze/Rsero
Estimate the annual force of infection using serological data

In nathoze/Rsero: Estimate the annual force of infection using serological data

The data

Models of the force of infection

The basic models

Using the models

Secondary parameters affecting the seroprevalence

Seroreversion

Age-dependent force of infection

Imperfect sensitivity and specificity

Likelihood from the annual force of infection

Setting categories in the serological data

Model comparison

Priors

Default prior distributions

For the constant model

For the piecewise constant model

For the outbreak model

For the constant+outbreak model

For the independent model

In the case of seroreversion

Specifying the parameters of the prior distributions: example

R Package Documentation

Browse R Packages

We want your feedback!

nathoze/Rsero Estimate the annual force of infection using serological data

In nathoze/Rsero: Estimate the annual force of infection using serological data

The data

Models of the force of infection

The basic models

Using the models

Secondary parameters affecting the seroprevalence

Seroreversion

Age-dependent force of infection

Imperfect sensitivity and specificity

Likelihood from the annual force of infection

Setting categories in the serological data

Model comparison

Priors

Default prior distributions

For the constant model

For the piecewise constant model

For the outbreak model

For the constant+outbreak model

For the independent model

In the case of seroreversion

Specifying the parameters of the prior distributions: example

R Package Documentation

Browse R Packages

We want your feedback!

nathoze/Rsero
Estimate the annual force of infection using serological data