Download the data as a SQLite database or as CSV files.

Latest data

The latest data are updated hourly. You can download them in several ways:

Download all-in-one

Download all the data at once as a compressed SQLite database file. The database contains two tables:

The two tables can be joined on the column id. Read more in the documentation.

| URL | Description | Format | Downloads |
|-------------------------------------------------------|----------------------------------------|--------------------------|------------------|
| https://storage.covid19datahub.io/latest.db.gz | Full database. Contains all the data. | GZIP | |
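As a minimal sketch of how the compressed database might be used from R, the snippet below decompresses `latest.db.gz` with base R connections and opens it with the DBI and RSQLite packages, which are assumed to be installed. The helper names `gunzip_file`, `join_on_id_sql`, and `open_latest` are hypothetical, and the table names are discovered at run time rather than assumed.

```r
# Decompress a .gz file using base R connections only.
gunzip_file <- function(gz_path, out_path) {
  src <- gzfile(gz_path, open = "rb")
  on.exit(close(src), add = TRUE)
  dst <- file(out_path, open = "wb")
  on.exit(close(dst), add = TRUE)
  repeat {
    chunk <- readBin(src, what = raw(), n = 1e6)
    if (length(chunk) == 0) break
    writeBin(chunk, dst)
  }
  invisible(out_path)
}

# Build the SQL that joins two tables on their shared `id` column.
join_on_id_sql <- function(left, right) {
  sprintf('SELECT * FROM "%s" JOIN "%s" USING (id)', left, right)
}

# Download, decompress, and open the latest database (requires network access).
open_latest <- function(url = "https://storage.covid19datahub.io/latest.db.gz") {
  gz_path <- tempfile(fileext = ".db.gz")
  db_path <- tempfile(fileext = ".db")
  download.file(url, gz_path, mode = "wb")
  gunzip_file(gz_path, db_path)
  DBI::dbConnect(RSQLite::SQLite(), db_path)
}

# Usage (not run here):
# con  <- open_latest()
# tabs <- DBI::dbListTables(con)  # names of the tables in the database
# DBI::dbGetQuery(con, paste(join_on_id_sql(tabs[1], tabs[2]), "LIMIT 5"))
# DBI::dbDisconnect(con)
```

The full database can be large, so the default `download.file` timeout may need to be raised with `options(timeout = ...)` before calling `open_latest()`.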

Download by level

Download worldwide data at 3 different levels of granularity: country-level data (level 1), state-level data (level 2), and city-level data (level 3).

| URL | Description | Format | Downloads |
|-------------------------------------------------------|----------------------------------------|--------------------------|------------------|
| https://storage.covid19datahub.io/level/1.csv.zip | Worldwide country-level data. | CSV -- ZIP -- GZIP | |
| https://storage.covid19datahub.io/level/2.csv.zip | Worldwide state-level data. | CSV -- ZIP -- GZIP | |
| https://storage.covid19datahub.io/level/3.csv.zip | Worldwide city-level data. | CSV -- ZIP -- GZIP | |
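The following sketch shows one way the level files listed above could be fetched from R. It only uses the `.csv.zip` URLs shown in the table; the helper names `level_url` and `read_level` are hypothetical, and the code assumes that the archive contains a single CSV file.

```r
# Build the download URL for one granularity level
# (1 = country, 2 = state, 3 = city).
level_url <- function(level) {
  stopifnot(length(level) == 1, level %in% 1:3)
  sprintf("https://storage.covid19datahub.io/level/%d.csv.zip", as.integer(level))
}

# Download the zip archive, extract it to a temporary directory, and read
# the first extracted file as CSV (requires network access).
read_level <- function(level) {
  zip_path <- tempfile(fileext = ".zip")
  download.file(level_url(level), zip_path, mode = "wb")
  csv_path <- unzip(zip_path, exdir = tempfile("level"))[1]
  read.csv(csv_path, fileEncoding = "UTF-8")
}

# Usage (not run here):
# x <- read_level(1)
# str(x)
```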

For developers. The endpoint to download data by level is:

where:

Download by country

select_country <- function(){
  country <- read.csv("https://storage.covid19datahub.io/country/index.csv", fileEncoding = "UTF-8")
  options <- paste(sprintf('<option value="%s">%s</option>', country$iso_alpha_3, country$name), collapse = "\n")
  sprintf('<select class="selectize" placeholder="Select a country" onchange="downloadTable(this, \'country\')"><option disabled selected value></option>\n%s\n</select>', options)
}
select_location <- function(){
  location <- read.csv("https://storage.covid19datahub.io/location/index.csv", fileEncoding = "UTF-8")
  location$name <- gsub("^(, )*", "", paste(sep = ", ",
    location$administrative_area_level_3,
    location$administrative_area_level_2,
    location$administrative_area_level_1))
  options <- paste(sprintf('<option value="%s">%s</option>', location$id, location$name), collapse = "\n")
  sprintf('<select class="selectize" placeholder="Select a location" onchange="downloadTable(this, \'location\')"><option disabled selected value></option>\n%s\n</select>', options)
}

Download the data for all the administrative divisions within one country.

r select_country()

For developers. The endpoint to download data by country is:

where:

The lookup table mapping countries to iso codes is available at https://storage.covid19datahub.io/country/index.<ext>
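As a small illustration, the lookup table can be searched for a country's ISO code before requesting its data. The sketch below relies on the `iso_alpha_3` and `name` columns used by the country selector on this page; the helper name `find_iso` is hypothetical.

```r
# Return the rows of the country index whose name matches a pattern
# (case-insensitive regular expression).
find_iso <- function(index, pattern) {
  hits <- grepl(pattern, index$name, ignore.case = TRUE)
  index[hits, c("iso_alpha_3", "name")]
}

# Usage (requires network access, not run here):
# index <- read.csv("https://storage.covid19datahub.io/country/index.csv",
#                   fileEncoding = "UTF-8")
# find_iso(index, "^ital")
```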

Download by location

Download the data by single location.

r select_location()

For developers. The endpoint to download data by location is:

where:

The lookup table mapping locations to ids is available at https://storage.covid19datahub.io/location/index.<ext>
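To find the id of a location, it can help to build a readable label from the three administrative levels, as the location selector on this page does. The sketch below is one possible way to do it; the helper name `location_label` is hypothetical, and the columns `id` and `administrative_area_level_1` to `administrative_area_level_3` are those used by the selector.

```r
# Combine the three administrative levels into one label, most specific
# first, skipping levels that are empty or missing.
location_label <- function(level_1, level_2, level_3) {
  parts <- cbind(as.character(level_3), as.character(level_2), as.character(level_1))
  apply(parts, 1, function(p) {
    p <- p[!is.na(p) & nzchar(p)]
    paste(p, collapse = ", ")
  })
}

# Usage (requires network access, not run here):
# location <- read.csv("https://storage.covid19datahub.io/location/index.csv",
#                      fileEncoding = "UTF-8")
# location$name <- location_label(location$administrative_area_level_1,
#                                 location$administrative_area_level_2,
#                                 location$administrative_area_level_3)
# location[grepl("Lombardia", location$name), c("id", "name")]
```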

Vintage data

Vintage data are immutable snapshots of the data taken each day. The vintage file on date YYYY-MM-DD contains the data that were available on that day. Typically, the data available on day $T$ include the counts up to day $T-1$ due to natural delays in reporting the data to the local authorities and different time zones worldwide.
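As a rough sketch, the URL of a vintage file can be derived from its snapshot date. The cut-over dates below are taken from the vintage tables on this page: ZIP archives for snapshots from 2020-04-14 to 2021-11-14, and compressed SQLite databases from 2021-11-15 onwards. The helper names `vintage_url` and `last_count_date` are hypothetical.

```r
# Build the URL of the vintage file for a given snapshot date.
vintage_url <- function(date) {
  date <- as.Date(date)
  stopifnot(length(date) == 1, !is.na(date), date >= as.Date("2020-04-14"))
  # Snapshots from 2021-11-15 onwards are SQLite databases (.db.gz);
  # earlier snapshots are zip archives.
  ext <- if (date >= as.Date("2021-11-15")) "db.gz" else "zip"
  sprintf("https://storage.covid19datahub.io/%s.%s", format(date, "%Y-%m-%d"), ext)
}

# Typically, a snapshot taken on day T includes the counts up to day T-1.
last_count_date <- function(date) {
  as.Date(date) - 1
}

# Usage:
# vintage_url("2022-01-01")
# last_count_date("2022-01-01")
```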

tab <- function(start, end, version){
  if(end<start) return(NULL)
  dates <- seq(as.Date(end), as.Date(start), by = -1)
  head <- "| URL | Data Sources | Snapshot Date | Downloads | \n |-------------------------------------------------------|----------------------------------------|--------------------------|------------------|"
  if(version<3){
    tab <- sprintf("| https://storage.covid19datahub.io/%s.zip | Included in the zip folder | %s | <img src=\"https://storage.covid19datahub.io/downloads/%s.zip.svg\" onerror=\"this.src='https://img.shields.io/badge/downloads-0-blue'\"/> |", dates, dates, dates)    
  }
  else{
      tab <- sprintf("| https://storage.covid19datahub.io/%s.db.gz | [Download PDF](https://storage.covid19datahub.io/%s.pdf) &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | %s | <img src=\"https://storage.covid19datahub.io/downloads/%s.db.gz.svg\" onerror=\"this.src='https://img.shields.io/badge/downloads-0-blue'\"/> |", dates, dates, dates, dates)
  }
  paste(head, paste0(tab, collapse = "\n"), sep = '\n')
}

r tab(start = "2023-03-01", end = Sys.Date()-1, version = 3)

Tests for Switzerland are now retrieved from the file COVID19Test_geoRegion_w. Before this update, the test counts for Switzerland were significantly lower, because only tests performed in hospitals were counted.

r tab(start = "2021-11-15", end = "2023-02-28", version = 3)

Version 3 is now live. This is a major update. The vintage data are now shipped in SQLite databases and the vintage data sources are reported in the corresponding PDF files.

The vintage data are now generated on the same date as the data snapshot. Before 14 November 2021, the vintage data were generated with a delay of 48 hours to make sure that all observations were complete and that no snapshot contained incomplete data. In fact, there is a natural delay in reporting the data to the local authorities (+24h), and time zones differ worldwide (+24h). This means that, for example, a vintage file for 1st November was actually generated on 3rd November, and the data between the 1st and 3rd November were filtered out. In other words, the vintage datasets before 14 November 2021 are affected by a look-ahead bias of 2 days. This is no longer the case after 14 November 2021.
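The 48-hour delay can be made explicit with a small helper that returns the date on which a vintage file was actually generated. This is only a sketch: the function name `vintage_generated_on` is hypothetical, and it assumes that the snapshot for 14 November 2021 itself still belongs to the delayed regime, which the text above leaves open.

```r
# Date on which the vintage file for a given snapshot date was generated.
# Assumption: snapshots up to and including 2021-11-14 were generated
# 2 days later; later snapshots are generated on the snapshot date.
vintage_generated_on <- function(snapshot) {
  snapshot <- as.Date(snapshot)
  delayed <- snapshot <= as.Date("2021-11-14")
  snapshot + ifelse(delayed, 2, 0)
}

# Usage:
# vintage_generated_on("2021-11-01")  # the 1st November example from the text
# vintage_generated_on("2022-01-01")
```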

See the changelog for further information on this update.

r tab(start = "2021-04-11", end = "2021-11-14", version = 2)

Due to the increasing size of the data files, we stopped providing the pre-processed data on 10 April 2021 to improve the update and storage of the raw data. Please switch to the raw data if you are still using the pre-processed files. Pre-processed data fill missing dates in the raw data with NA values. This ensures that all locations share the same grid of dates and no single day is skipped. Then, NA values are replaced with the previous non-NA value or 0.
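Users who relied on the pre-processed files could reproduce a similar result from the raw data. The base R sketch below is not the original pre-processing code: the function names `locf_or_zero` and `fill_grid` are hypothetical, the data are assumed to have an `id` column, a `date` column of class `Date`, and numeric value columns, and 0 is assumed to be used only when no previous non-NA value exists.

```r
# Replace each NA with the previous non-NA value, or with 0 when there is
# no previous value.
locf_or_zero <- function(v) {
  # Position of the most recent non-NA observation (0 if none so far).
  idx <- cummax(seq_along(v) * !is.na(v))
  out <- rep(0, length(v))
  out[idx > 0] <- v[idx[idx > 0]]
  out
}

# Put all locations on the same daily grid of dates, then fill the gaps.
fill_grid <- function(x, dates = seq(min(x$date), max(x$date), by = "day")) {
  grid <- expand.grid(id = unique(x$id), date = dates, stringsAsFactors = FALSE)
  out <- merge(grid, x, by = c("id", "date"), all.x = TRUE)  # missing days become NA
  out <- out[order(out$id, out$date), ]
  for (col in setdiff(names(out), c("id", "date"))) {
    out[[col]] <- ave(out[[col]], out$id, FUN = locf_or_zero)
  }
  rownames(out) <- NULL
  out
}

# Usage with toy data:
# x <- data.frame(id = c("a", "a", "b"),
#                 date = as.Date(c("2021-01-01", "2021-01-03", "2021-01-02")),
#                 confirmed = c(1, 3, 5))
# fill_grid(x)
```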

r tab(start = "2020-12-30", end = "2021-04-10", version = 2)

Since 2020-12-30, the datasets include the column vaccines as described here.

r tab(start = "2020-12-12", end = "2020-12-29", version = 2)

Since 2020-12-12, policies for admin areas level 3 are inherited from the policies available for admin areas level 2 as described here.

r tab(start = "2020-04-14", end = "2020-12-11", version = 2)

r gsub("^# ", "## ", readr::read_file('../LICENSE.md'))



covid19datahub/COVID19dev documentation built on March 16, 2023, 3:22 a.m.