knitr::opts_chunk$set( collapse = TRUE, comment = NA, echo = TRUE, warning = FALSE, message = FALSE, error = TRUE, cache = FALSE, fig.path = "man/figures/README-", out.width = "100%" )
Chaccour and Brew
Aurresku is a Basque dance. It's intricate, complex, and meant to honor those of high stature in the community. It's also the tentative name of this project: an intricate, complex attempt at "the dance", ie the long-term effort to control COVID-19 until there's a vaccine.
In scientific lingo, we might call this project something like "Guding post-lockdown COVID-19 confinement policies through an assessment of spatial-temporal heterogeneity in population vulnerability and receptivity".
We borrow two concepts from the world of malaria research:
Receptivity: how "well-suited" an area is for high-risk cases
Vulnerability: the frequency of influx of infected individuals or groups
Understanding the concept through examples:
Neither receptivity nor vulnerability can be understood in a vacuum. Both require incorporating the concepts of susceptibility and spatial-temporal infection risk
Susceptibility: how much of the population remains uninfected and is therefore susceptible to becoming infected and infectious
Spatial-temporal risk: how much disease activity is there nearby (understanding "nearby" to be a function of actual flows of human mobility, not simple linear distance) at the time in question
We propose a risk "index", which takes into account all four of the aforementioned concepts:
Once constructed, this index serves to determine the level of "loosening" of social distancing measures so as to (i) minimize loss of life and health, (ii) maximize social and economic activity, and (iii) implement the correct degree of disease control measures (ie, robust enough to prevent contagion, but not overly robust so as to restrict human activity or lead to poor compliance).
First, we'll prepare data for analysis.
# Load the package for this project library(aurresku) # Load other useful packages library(knitr) library(tidyr) library(dplyr) library(sp) library(ggplot2) library(RColorBrewer) # Get census age data census <- aurresku::census # Get municipios spatial data municipios <- aurresku::municipios
We'll define a function for getting "receptivity" (ie, age-based risk) for each municipality. In this first iteration, we'll just set it as the percentage of people who are above age n
define_receptivity <- function(data, n){ data %>% mutate(receptive = edad >= n) %>% summarise(pop_receptive = sum(total[receptive], na.rm = TRUE), total_pop = sum(total, na.rm = TRUE)) %>% ungroup %>% mutate(p_receptive = pop_receptive / total_pop * 100) }
We then define receptivity for each municipio in Spain, with an age cut-off of, for example, 80 years:
risks <- census %>% group_by(municipio, id) %>% define_receptivity(n = 80) %>% arrange(desc(p_receptive))
Let's take a peak at the most "receptive" municipalities (ie, those whose populations are most pre-disposed to severe cases):
risks %>% head %>% kable
The below shows the distribution of percentage of people 80 or older by municipality.
ggplot(data = risks, aes(x = p_receptive)) + geom_density(fill = 'darkorange', alpha = 0.6) + theme_simple() + geom_text(data = tibble(x = c(12, 40), y = c(0.01, 0.01), label = c('Very many\nlow-receptivity municipalities\n(can have more relaxed measures)', 'Very few especially\nhigh-receptivity municipalities\n(need tighter measures)')), aes(x = x, y = y, label = label)) + labs(title = 'Distribution of receptivity by municipality', subtitle = '"Receptivity"= % of inhabitants 80 or older')
Let's think of this another way. If we created an arbitrary cut-off for relaxing certain measures at, for example, a receptivity of 25% or lower, we would see:
x = risks %>% group_by(status = ifelse(p_receptive <= 25, 'Relax', 'No-relax')) %>% summarise(municipalities = n(), population = sum(total_pop)) %>% mutate(percentage = round(population / sum(population) * 100, digits = 2)) names(x) <- Hmisc::capitalize(names(x)) x %>% kable
Let's map receptivity (again, using the 80 years cut-off example).
map <- municipios map@data <- left_join(map@data, risks, by = 'id') mycolours <- brewer.pal(8, "YlOrRd") spplot(map, 'p_receptive', par.settings = list(axis.line = list(col ="transparent")), main = "% of population 80 or over by municipality", cuts = 5, col ="transparent", col.regions = mycolours)
We can vary a bit. Let's do 70 years...
risks <- census %>% group_by(municipio, id) %>% define_receptivity(n = 70) %>% arrange(desc(p_receptive)) map <- municipios map@data <- left_join(map@data, risks, by = 'id') mycolours <- brewer.pal(8, "YlOrRd") spplot(map, 'p_receptive', par.settings = list(axis.line = list(col ="transparent")), main = "% of population 70 or over by municipality", cuts = 5, col ="transparent", col.regions = mycolours)
and 60 years...
risks <- census %>% group_by(municipio, id) %>% define_receptivity(n = 60) %>% arrange(desc(p_receptive)) map <- municipios map@data <- left_join(map@data, risks, by = 'id') mycolours <- brewer.pal(8, "YlOrRd") spplot(map, 'p_receptive', par.settings = list(axis.line = list(col ="transparent")), main = "% of population 60 or over by municipality", cuts = 5, col ="transparent", col.regions = mycolours)
Good. Now we've identified especially "receptive" populations (ie, those with an age structure that puts them at risk). Task A1 (1 of 4) done.
Time to do the other three tasks:
For vulnerability, we want to estimate how much movement there is between each municipality and other areas. We could do this with raw data from, for example, mobile networks. But until we have those data, we'll use a more simple metric: weighted-distance to nearby population centers.
Essentially, "vulnerability" means how close you are to large population centers and how large those population centers are. So, a town that is 15 minutes away from Manresa likely has some intermingling with Manresa; by the same token, a town adjoining Madrid likely has lots of intermingliing with Madrid.
Our weighting function looks like this:
weighter <- function(x){1 / (x^(1/10))} population_weighter <- function(x){x ^(1/2)} x <- 1:150 y <- weighter(x) df <- tibble(x,y) ggplot(data = df, aes(x = x, y = y)) + geom_line() + theme_simple() + labs(x = 'Distance from city', y = 'Relative weight', title = 'How we understand the importance of proximity')
We'll define vulnerability as the sum of the population of all nearby (<150km) population centers weighted by the distance. We calculate this for every municipality by creating a matrix of distances from municipality centroids to the nearby (<150km) population centers.
# Get cities locations and populations cities_sp <- aurresku::cities_sp # Define matrix of distances distances <- rgeos::gDistance(spgeom1 = municipios, cities_sp, byid = T) out <- rep(NA, ncol(distances)) for(j in 1:ncol(distances)){ this_municipality <- municipios@data$NAMEUNIT[j] distance_values <- distances[,j] weighted <- population_weighter(cities_sp@data$population) * weighter(distance_values) out[j] <- sum(weighted) } # Now, for every municipality, we have a vulnerability score map <- municipios map@data$vulnerability <- out
Having calculated vulnerability score, let's visualize
mycolours <- brewer.pal(8, "YlOrRd") spplot(map, 'vulnerability', par.settings = list(axis.line = list(col ="transparent")), main = "Vulnerability score", cuts = 8, col ="transparent", col.regions = mycolours)
Great. We have some concept of vulnerability. It doesn't stand alone well (ie, the fact that you're close to a place only matters if that place has infections), but that's what the next steps are for.
What's next? We need to assess (1) susceptibility and (1) spatial-temporal risk. Urgently
For (1), we need funds and a team to start preparations for mass serology. Sampling strategy and timing will be fundamental to getting actionable data. This will almost certainly be the most expensive, but also most important, part of this work.
For (2), we need the Ministry of Health to release municipality-level data as soon as possible. This should be granular (ie, a dataset with one row per case, with variables including municipality, date of symptom onset, age, sex, place of diagnosis).
This analysis is set-up as an R package. One can install it in one of two ways:
Clone from https://github.com/joebrew/aurresku and then build package from source (sequentially walking through the code in data-raw/update.R
and then running update.sh
from the command line).
In an R session, install the package directly from github: devtools::install_github('joebrew/aurresku')
.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.