Data Challenge 2017 (MSc in Statistics, Imperial): mitigating epidemic disasters

Background

Stats Section

The West African Ebola virus epidemic was the most widespread outbreak of Ebola virus disease in history, with a disastrous impact on local populations and entire countries. Other recent examples of major infectious disease outbreaks are the 2015-2016 Zika epidemic and the MERS-coronavirus epidemic.

In response, the World Health Organization is characterising the risk that infectious disease threats pose to its member states. The primary aims are to prioritise resources for epidemic prevention, early detection, and control by identifying the areas most in need of resources, and to predict the impact of infectious disease threats.

Primary objective

With these aims in mind, records of over 20,000 large-scale disasters have been digitised. For the 2017 Data Challenge, epidemic outbreak counts are available from 66 countries for 5 years, 2010 to 2014.

Your primary objective is to predict the number of epidemic disasters in the same countries in the years 2015 and 2016.
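As a point of reference for this prediction task, a naive baseline can be sketched as follows. It assumes a hypothetical layout in which counts are stored per country and year; the country codes and numbers are invented for illustration and are not taken from the challenge data. The baseline simply forecasts each country's historical mean count for both target years, which gives a floor that any richer model should beat.

```python
from statistics import mean

# Hypothetical toy data: {country: {year: number of epidemic disasters}}.
# Country codes and counts are invented for illustration only.
counts = {
    "AAA": {2010: 1, 2011: 0, 2012: 2, 2013: 1, 2014: 1},
    "BBB": {2010: 0, 2011: 0, 2012: 0, 2013: 1, 2014: 0},
}


def baseline_forecast(history, target_years=(2015, 2016)):
    """Predict each target year as the country's mean historical count."""
    forecasts = {}
    for country, by_year in history.items():
        avg = mean(by_year.values())
        forecasts[country] = {year: avg for year in target_years}
    return forecasts


predictions = baseline_forecast(counts)
for country, by_year in predictions.items():
    print(country, by_year)
```

Because the targets are counts, a count model (for example a Poisson regression) is a natural next step once predictors are added; the mean baseline is only a yardstick.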

Why we care

Scientifically, we are interested in identifying and understanding the interplay of factors that are predictive of epidemic disasters at a country level.

Open data for making a difference

To model vulnerability, the World Health Organization is collecting data on countries' capacity to respond to major public health crises. The IHR data set contains records of 11 capacity indicators per country for each year from 2010 to 2016, which could be used to inform a predictive model of epidemic disasters in 2015 and 2016.

Some entries are missing; that is the first challenge.

The World Bank has also made other large data sets openly available that describe the wealth and health of countries. There is no reason why you should limit yourself to the IHR data set in predicting country-level risk of major epidemic disease outbreaks. For this year's Data Challenge, we have prepared several data sets for you, bringing the number of potential predictors to 500. This is where the challenge starts to be fun.

Combine the data, explore it, and see what is worth including in your predictive model.
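A minimal sketch of the combining step and a first look at missingness is given below. It assumes records keyed by (country, year); the indicator names `surveillance` and `laboratory`, the country codes, and all values are hypothetical placeholders rather than the actual column names or contents of the challenge data sets.

```python
# Hypothetical toy records keyed by (country, year); None marks a missing entry.
outbreaks = {
    ("AAA", 2010): 1,
    ("AAA", 2011): 0,
    ("BBB", 2010): 0,
    ("BBB", 2011): 2,
}
indicators = {
    ("AAA", 2010): {"surveillance": 80, "laboratory": None},
    ("AAA", 2011): {"surveillance": 85, "laboratory": 60},
    ("BBB", 2010): {"surveillance": None, "laboratory": 40},
    # ("BBB", 2011) has no indicator record at all.
}


def combine(outbreaks, indicators, names):
    """Left-join indicator values onto the outbreak counts by (country, year)."""
    rows = []
    for key in sorted(outbreaks):
        country, year = key
        row = {"country": country, "year": year, "count": outbreaks[key]}
        values = indicators.get(key, {})
        for name in names:
            # Absent records and explicit None both end up as None.
            row[name] = values.get(name)
        rows.append(row)
    return rows


def missing_fraction(rows, names):
    """Share of rows in which each indicator is missing."""
    return {
        name: sum(row[name] is None for row in rows) / len(rows)
        for name in names
    }


names = ["surveillance", "laboratory"]
rows = combine(outbreaks, indicators, names)
missing = missing_fraction(rows, names)
print(missing)
```

Joining on the outbreak records keeps every country-year that needs a prediction, so gaps in the predictors show up as missing values instead of silently dropping rows. The per-indicator missing fraction is one simple way to decide which predictors are worth imputing and which to set aside.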

Getting started

Please follow the HowTo guides in the green project bar at the top of the page for further help. Good luck and we hope you enjoy the challenge!

Acknowledgments

Thanks go to Victor del Rio Villas at the WHO for many discussions on this topic.



olli0601/DataChallenge.2017 documentation built on May 29, 2019, 7:34 a.m.