In smitdave/MASH: MASH (Modular Analysis and Simulation for Human Health)

Simulation

Practical limits on computation and the problems of scaling from individuals up to continents call for sensible approaches to the quantification of parasite mobility.

Time at Risk : From Individuals to Populations

Let $\left\{h\right\}$ enumerate the humans in some defined geographical region. We want to define an activity space algorithm $A_i$ for each $h_i$ for purposes of simulation that describes their average time at risk among all the points on the landscape where transmission occurs, $\left\{f\right\}$. The following is a proposed algorithm for doing this.

For practical reasons, we presume time at risk is related to time spent, so that points in $\left\{f\right\}$ are (for the most part) places where humans live. In particular, we could define this for a grid. We define a kernel that weights every point in $\left\{f\right\}$ by distance and other features: $K = K_0 + K_1 + K_2$. Here $K_0$ is home, and the proportion of time at risk spent at home (or in the home patch) is $P_0$; $K_1$ is defined over pairs of patches, and the proportions of time at risk in $\left\{f\right\}$ during routine mobility sum to $P_1$; and $K_2$ is defined over pairs of patches, and the proportions of time at risk in $\left\{f\right\}$ during travel sum to $P_2 = 1-P_0-P_1$. We can use any functional forms we want for $K_1$ or $K_2$, choosing them to fit the data we have.

Whereas $K_i$ defines a set of weak propensities for the human $h_i$ to travel among points in $\left\{f\right\}$, we assume a large fraction of human mobility is characterized by habitual movements: strong propensities to visit a few places ($n$). To characterize these propensities, for each human we draw a set of points consistent with the average propensities determined by $K$.

From $K_1 + K_2$, compute a weight on each point $f_k$ that is not home and normalize to get $w_k$.
Order the weights, and produce a CDF.
Draw $n$ uniform random variates $v_j$ and invert the CDF to choose locations, $\left\{j\right\} \subset \left\{f\right\}$.
We also use the $v_j$ to assign the proportion of time at risk that is (on average) consistent with the kernels. WLOG, let the $v_j$ represent an ordered list, and let $t_j = (v_{j+1} + v_{j})/2 - (v_{j} + v_{j-1})/2 = (v_{j+1} - v_{j-1})/2$ be the set of distances between their midpoints (with 0 and 1 taking the place of the missing midpoint for the lowest and highest $j$).
The activity space $A_i$ for the $i^{th}$ individual is the index of home and $P_0$, together with the list of $j$ and $t_j$, where the $t_j$ are rescaled so that $\sum t_j = P_1 + P_2$.
We may also want to assume that $A_i$ represents a generalized notion associated with the risk of predictable travel, but that there is some (small) residual risk associated with other travel, represented by another functional form that is easy to compute.
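The drawing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `draw_activity_space`, its arguments, and the choice to rescale the midpoint widths by $1 - P_0$ are hypothetical and not part of MASH. The same place can be drawn more than once, in which case its shares of time simply add.

```python
import bisect
import random
from itertools import accumulate

def draw_activity_space(weights, home, n, p_home, rng):
    """Draw n habitual places and their shares of time at risk.

    weights : kernel weights (K_1 + K_2) for every point in {f}
    home    : index of the home point, excluded from the draw
    p_home  : P_0, the proportion of time at risk spent at home
    rng     : a random.Random instance
    """
    # Candidate points: everything except home, ordered by weight.
    candidates = sorted((k for k in range(len(weights)) if k != home),
                        key=lambda k: weights[k])
    total = sum(weights[k] for k in candidates)
    cdf = list(accumulate(weights[k] / total for k in candidates))
    cdf[-1] = 1.0                                # guard against round-off

    v = sorted(rng.random() for _ in range(n))   # ordered uniform variates v_j
    # Invert the CDF: first index whose cumulative weight exceeds v_j.
    places = [candidates[bisect.bisect_right(cdf, vj)] for vj in v]

    # Cell edges: 0, the midpoints between consecutive v_j, and 1.
    edges = [0.0] + [(a + b) / 2.0 for a, b in zip(v, v[1:])] + [1.0]
    # The widths sum to 1; rescale so they sum to P_1 + P_2 = 1 - P_0.
    t = [(hi - lo) * (1.0 - p_home) for lo, hi in zip(edges, edges[1:])]
    return places, t
```

For example, `draw_activity_space([5.0, 1.0, 2.0, 3.0, 4.0], home=0, n=3, p_home=0.6, rng=random.Random(1))` returns three non-home indices and three shares that sum to 0.4.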

For each $h_i$ this gives us an activity space, and we can then compute the connectivity matrix $\psi$ for any arbitrary set of patches containing $\left\{f\right\}$ by ordering the $h$ such that all the humans living in the same patches defined by $\left\{f\right\}$ or any grouping of $\left\{f\right\}$ are contiguous. Each row in a large matrix is the activity space algorithm $A_i$. The elements of $\psi$ are the sums of the blocks of the connectivity matrix given by the rows in $i$.
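A minimal sketch of the block summation, assuming each individual's activity space has already been expanded into a row of time-at-risk proportions over the points in $\left\{f\right\}$. The function name and the patch-label arguments are hypothetical. Indexing by patch label gives the same result as ordering the humans so that patch members are contiguous and then summing blocks.

```python
def aggregate_connectivity(activity, human_patch, point_patch, n_patches):
    """Sum individual activity-space rows into a patch-by-patch matrix.

    activity[i][k] : proportion of time at risk human i spends at point k
    human_patch[i] : label of the patch in which human i lives
    point_patch[k] : label of the patch containing point k
    """
    psi = [[0.0] * n_patches for _ in range(n_patches)]
    for i, row in enumerate(activity):
        for k, share in enumerate(row):
            # Add this share to the (home patch, visited patch) block.
            psi[human_patch[i]][point_patch[k]] += share
    return psi
```

Each row of the result sums to the number of residents of that patch when every individual row sums to 1, so dividing a row by the resident count would give average proportions if those are wanted instead.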

Time at Risk : A Practical Approach

A problem for us is that unless we wish to model the whole globe all at once, our defined geographical region is not "closed": there is some time at risk spent outside the region. We let $\delta$ denote the total force of infection from importation. This trick can actually serve us well in other ways.

Human population distributions have spatial structure, and they are naturally aggregated into clusters. Large cities and urban areas (densely populated areas) present a distinct problem from rural towns, as do the isolated houses away from settlements. A closely associated problem is that our data are often aggregated into some region (polygon) that defines populations in a way that is not necessarily good for simulating malaria. Our data sometimes come as the specific locations of households, or as gridded data, populations around mobile phone towers, or a country's administrative units. We would like to develop internally consistent methods that allow us to simulate malaria on whatever aggregation of populations (planar tilings) we choose. In particular, knowing how to scale up from individuals to populations can (perhaps) give us some indication of how to scale down from arbitrary tilings to individuals and (perhaps) back up to other tilings.

The idea here is quite simple: we can scale down to the smallest set of tilings (i.e. the gridded map), scale up to the tiles defined by the union of all tiling boundaries, and thence up to the sets of populations as they come to us. Using the approach described above, this works, in principle, for individual households. By running the model on an ensemble of patches, we can get a set of internal checks.

This works, in principle, for large cities as well, though proximity, population density, and the degree of interconnectedness suggest we may wish to consider tiling the city distribution by aggregating grids up (i.e. $2 \times 2$, $4 \times 4$, $8 \times 8$, $16 \times 16$, $\ldots$).
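Aggregating a grid upward can be illustrated with a short sketch. It assumes the population is stored as a list of rows with dimensions that are multiples of the aggregation factor; the function name `aggregate_grid` is hypothetical.

```python
def aggregate_grid(pop, factor):
    """Sum a gridded population over factor x factor blocks of cells."""
    rows, cols = len(pop), len(pop[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = [[0.0] * (cols // factor) for _ in range(rows // factor)]
    for r in range(rows):
        for c in range(cols):
            # Each fine cell contributes to exactly one coarse cell.
            coarse[r // factor][c // factor] += pop[r][c]
    return coarse
```

Calling it with `factor` set to 2, 4, 8, and so on produces the nested tilings mentioned above, and the total population is the same at every level.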

Scaling the Activity Space

We start with an activity space matrix $A$, where $A_{i,j}$ is the proportion of time at risk from $i$ to $j$. Each patch has a population density, $H_i$, a center of its population mass, $(x_i,y_i)$, and other relevant features $\vec \phi_i$. Each patch also has some relevant properties describing malaria transmission (see below). The distance matrix $D$ for these centers has pairwise distance elements $d_{i,j}$. To split the $i^{th}$ patch into two patches, $P_a$ and $P_b$, we need a new connectivity matrix, $A'$, where the $i^{th}$ row and column are replaced by two new rows and columns. We will also need to know: the new population densities, $H_a$ and $H_b$; the features of each new patch, $\vec \phi_a$ and $\vec \phi_b$; and the centers of population mass, $(x_a,y_a)$ and $(x_b,y_b)$. That is, we create an algorithm, $F$, to make a new $A'$ from $A$: $A' = F(i, f, A, P_i, P_a', P_b')$. We impose a rule that the quantities are conserved, so that if we reversed the process to join two patches, adding rows and columns (without knowing $f$), we would get back to $A = F^{-1}(A',P_a',P_b')$.

Note that there is an asymmetry here in that splitting a cell requires more information than joining one: $A'$ has more information in it than $A$. To be able to deal with this uncertainty and the asymmetry of scaling down, we want to use a general function describing time spent from $i$ to $j$, $f(d,H_i,H_j,\phi_i,\phi_j)$, so that we can deal with uncertainty in travel patterns at lower scales.

There are three tasks: internal connectivity (splitting $A_{i,i}$ into a $2 \times 2$ connectivity matrix to replace it); recomputing connectivity from the $i^{th}$ patch to every other patch (splitting a row of $A$ into two rows in $A'$); and recomputing connectivity to the $i^{th}$ patch from every other patch (splitting a column of $A$ to make two columns in $A'$). Abusing notation a bit, the conservation condition implies that for each new element of the first new row of $A'$:

$$A_{i,a}' = \frac{ f(d_{a,i}, H_{i}, H_a, \phi_i, \phi_a) }{ f(d_{a,i}, H_{i}, H_a, \phi_i, \phi_a) + f(d_{b,i}, H_{i}, H_b, \phi_i, \phi_b)} A_{i,i} $$

and each element of the first new column of $A'$ obeys the rule:

$$A_{a,i}' = \frac{f(d_{a,i}, H_{a}, H_{i}, \phi_a, \phi_i)}{f(d_{a,i}, H_{a}, H_{i}, \phi_a, \phi_i) +f(d_{b,i}, H_{b}, H_{i}, \phi_b, \phi_i)} A_{i,i}.$$
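The proportional rule can be illustrated with a small sketch: the time at risk that connects another patch with the patch being split is divided between the two new patches in proportion to $f$, so the two pieces always add back to the original value. The kernel `gravity_f` (destination population with exponential distance decay, ignoring the features $\phi$) and its `scale` parameter are assumptions made for illustration; the text leaves the functional form of $f$ open.

```python
import math

def gravity_f(d, h_from, h_to, phi_from=None, phi_to=None, scale=10.0):
    """Hypothetical kernel: destination population with distance decay."""
    return h_to * math.exp(-d / scale)

def split_destination(a_old, d_a, d_b, h_other, h_a, h_b, f=gravity_f):
    """Split time at risk a_old between new patches a and b.

    d_a, d_b : distances from the other patch to the new centers
    h_other  : population of the other patch
    h_a, h_b : populations of the two new patches
    """
    f_a = f(d_a, h_other, h_a)
    f_b = f(d_b, h_other, h_b)
    to_a = a_old * f_a / (f_a + f_b)
    return to_a, a_old - to_a      # the two pieces sum to a_old by construction
```

With equal distances, `split_destination(0.2, 5.0, 5.0, 100.0, 30.0, 10.0)` assigns three quarters of the time to the larger patch, because this particular kernel weights destinations by population.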

Internal travel must partition time spent in one patch into time spent in each patch by each population while following the same rules. Note that (dropping the $i$ index) we have to create a $2 \times 2$ matrix describing time spent in each patch by members of each patch, and: $$A_{i,i} H = (A_{a,a}' + A_{a,b}') H_a + (A_{b,a}' + A_{b,b}') H_b.$$

Following the pattern above, we get that: $$A_{a,a}' = \frac{f(0, H_a, H_b, \phi_a, \phi_b)}{f(0, H_a, H_b, \phi_a, \phi_b) + f(d_{a,b}, H_a, H_b, \phi_a, \phi_b)} A_{i,i}$$ and $$A_{a,b}' = \frac{f(d_{a,b}, H_a, H_b, \phi_a, \phi_b)}{f(0, H_a, H_b, \phi_a, \phi_b) + f(d_{a,b}, H_a, H_b, \phi_a, \phi_b)} A_{i,i}$$
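A sketch of the internal split, following the argument order used in the expressions for $A_{a,a}'$ and $A_{a,b}'$ and assuming the row for patch $b$ is obtained by swapping the roles of $a$ and $b$. The kernel `decay_kernel` is a hypothetical stand-in for $f$ that ignores the features $\phi$, and the conservation check assumes $H = H_a + H_b$.

```python
import math

def decay_kernel(d, h_from, h_to):
    """Hypothetical kernel: destination population with distance decay."""
    return h_to * math.exp(-d / 10.0)

def split_internal(a_ii, d_ab, h_a, h_b, f):
    """Return the 2 x 2 block [[A'_aa, A'_ab], [A'_ba, A'_bb]]."""
    def split_row(h_home, h_other):
        stay = f(0.0, h_home, h_other)    # weight on staying in the home half
        move = f(d_ab, h_home, h_other)   # weight on visiting the other half
        return a_ii * stay / (stay + move), a_ii * move / (stay + move)

    aa, ab = split_row(h_a, h_b)
    bb, ba = split_row(h_b, h_a)
    return [[aa, ab], [ba, bb]]

block = split_internal(0.8, 5.0, 30.0, 10.0, decay_kernel)
```

Each row of `block` sums to the original $A_{i,i}$, so the population-weighted total on the right-hand side of the conservation identity equals $A_{i,i}(H_a + H_b)$.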

The information asymmetry is more difficult than it would seem, because splitting a cell requires knowing not only the population densities, but also the locations of the new cell centers, each a population-weighted location:

$$\begin{array}{rl}x_i &= (H_a x_a + H_b x_b)/H_i \\ y_i &= (H_a y_a + H_b y_b)/H_i\end{array}$$
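As a small worked illustration of the joining direction, the sketch below recovers the center of a merged patch from the two halves. It treats $H$ as a population size so that $H_i = H_a + H_b$, which is an assumption; the function name is hypothetical.

```python
def join_centers(h_a, xy_a, h_b, xy_b):
    """Population and population-weighted center of two joined patches."""
    h_i = h_a + h_b
    x_i = (h_a * xy_a[0] + h_b * xy_b[0]) / h_i
    y_i = (h_a * xy_a[1] + h_b * xy_b[1]) / h_i
    return h_i, (x_i, y_i)
```

Joining is fully determined by the two halves, whereas many different pairs of centers and populations are consistent with the same joined center, which is why splitting needs extra information.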

Similarly, the features of the landscape must scale following similar population-weighted or area-weighted rules.

Scaling Transmission

In scaling down a transmission model, we must assign new reproductive numbers, $R_a$ and $R_b$, (including vectorial capacity and health seeking) to each cell in a way that conserves internal values. Ideally, we would like to have that

$$H_i x_i = H_a x_a + H_b x_b$$

Since, in general, the ratio $A_{j,a} / A_{j,b}$ will be different for each $j$, but there are only two new quantities, $x_a$ and $x_b$, it will not be possible to satisfy this relationship generically. Instead, we will be satisfied to rescale the $\left\{R_j\right\}$ to hold prevalence constant everywhere.

Exercise: Scaling Down Patches to Patches

Develop the algorithm for scaling from one set of large, defined patches defined with grid-based population estimates to a different set of defined patches (i.e. A3 to Wenger Populations).

Automated Scaling Down

Develop the algorithm for scaling down in an automated way by defining populations based on human population density and malaria maps.

Malaria Connectivity

Scaling Malaria Connectivity

We now wish to scale down malaria connectivity using the scaled down human movement. Since time at risk scales appropriately with the new activity space matrix $A'$, conserving malaria transmission properties must follow the constraint that the daily force of infection scales linearly as well: $h_i = h_1' + h_2'$, and, if possible, that the number of infected people scales approximately linearly as well. For modeling transmission dynamics and control, we must find scaling rules for the parameters describing the underlying transmission process. To do this, we let $R_{0,i}$ and $R_{C,i}$ denote the local reproductive numbers, $\alpha_{i}$ the proportion of cases treated, and $\vec \eta_i$ the probability distribution function of where those. We want to obtain new values for each one of the new patches ($R_{0,\cdot}'$, $\alpha_\cdot'$, $R_{C,\cdot}'$, $\vec \eta_\cdot'$, with $\cdot = 1,2$) that fit the data we might have when aggregated in the new patches. For this, we must specify a transmission model.

Ross-Macdonald Model

NOTE: Apologies for the notational overlap. Think "scoping" for notation.


Given the matrix $A$ we can write out the Ross-Macdonald equations. Let $\vec H$ denote the population density in each patch, $\vec X$ the density of infected and infectious humans, $\vec R_C$ the reproductive numbers, $c$ and $S$ the parameters describing infectivity of humans and the stability index, and $r$ the recovery rate. Let $\vec x = \vec X/\vec H$, and let $\vec {H'} = A^T \vec H$ and $\vec {X'} = A^T \vec X$, and $\vec {x'} = \vec{X'}/\vec{H'}$. Then we get the following equation for changing prevalence: $$ \frac{1}{r} \frac {d \vec x}{dt}=A \left(I \frac{ \vec{H} \vec{x'}} {\vec{H'}(1+cS\vec{x'})}\vec R_C\right)(1-\vec x) - \vec x$$
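The right-hand side can be sketched numerically under one reading of the equation. The equation leaves implicit which products are matrix products and which are elementwise, so the sketch assumes that the fraction is evaluated patch by patch, that $A$ then averages it over the patches each resident population visits, and that the result multiplies $(1 - \vec x)$ elementwise. The function name and argument layout are hypothetical, and every patch is assumed to have visitors ($H' > 0$).

```python
def ross_macdonald_rhs(x, H, A, R_C, c, S, r):
    """dx/dt in each patch under the elementwise reading described above."""
    n = len(x)
    X = [H[i] * x[i] for i in range(n)]
    # Visitor-weighted quantities: H' = A^T H, X' = A^T X, x' = X' / H'.
    H_vis = [sum(A[i][j] * H[i] for i in range(n)) for j in range(n)]
    X_vis = [sum(A[i][j] * X[i] for i in range(n)) for j in range(n)]
    x_vis = [X_vis[j] / H_vis[j] for j in range(n)]
    # Per-patch transmission term inside the parentheses.
    g = [H[j] * x_vis[j] * R_C[j] / (H_vis[j] * (1.0 + c * S * x_vis[j]))
         for j in range(n)]
    # Exposure of residents of patch i: A-weighted sum over visited patches.
    exposure = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
    return [r * (exposure[i] * (1.0 - x[i]) - x[i]) for i in range(n)]
```

As a check on this reading, with a single patch and $A = 1$ the expression reduces to $\frac{1}{r}\frac{dx}{dt} = R_C \frac{x}{1+cSx}(1-x) - x$, whose nonzero equilibrium is $x = (R_C - 1)/(R_C + cS)$.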

smitdave/MASH documentation built on May 30, 2019, 5:02 a.m.
