knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  message = FALSE,
  warning = FALSE
)
library(coronavirus)
The coronavirus package provides a tidy format dataset of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic. The raw data is pulled from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) Coronavirus repository.
More details are available here, and a CSV version of the package dataset is available here.
A summary dashboard is available here.
As this is an ongoing situation, the data format may change frequently. Please visit the package news to get updates about those changes.
Install the CRAN version:
install.packages("coronavirus")
Install the GitHub version (refreshed on a daily basis):
# install.packages("devtools")
devtools::install_github("RamiKrispin/coronavirus")
While the coronavirus CRAN version is updated every month or two, the GitHub (Dev) version is updated on a daily basis. The update_dataset function closes this gap by bringing the installed version up to date with the most recent data available in the GitHub version:
library(coronavirus)
update_dataset()
Note: You must restart the R session for the updates to become available.
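One way to confirm that the refreshed data is in use after the restart is to look at the most recent date in the dataset. The sketch below is illustrative only: days_behind() is a hypothetical helper written for this example, not a function of the coronavirus package, and it is run on a toy data frame with invented dates so that it works without the package installed.

```r
# days_behind() is a hypothetical helper, not part of the coronavirus package.
# It reports how many days the newest record lags behind a reference date.
days_behind <- function(df, today = Sys.Date()) {
  stopifnot("date" %in% names(df))
  as.numeric(today - max(df$date))
}

# Toy stand-in for the coronavirus dataset (invented dates)
toy_dates <- data.frame(date = as.Date(c("2020-03-01", "2020-03-02")))

lag_days <- days_behind(toy_dates, today = as.Date("2020-03-05"))
print(lag_days)
```

With the package loaded, the same helper could be called as days_behind(coronavirus), assuming the date column is of class Date.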
Alternatively, you can pull the data in the Covid19R project data standard format with the refresh_coronavirus_jhu function:
covid19_df <- refresh_coronavirus_jhu()
head(covid19_df)
data("coronavirus")
This coronavirus dataset has the following fields:
date - The date of the summary
province - The province or state, when applicable
country - The country or region name
lat - Latitude point
long - Longitude point
type - The type of case (e.g., confirmed, death)
cases - The number of daily cases (corresponding to the case type)

head(coronavirus)
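Because cases holds daily counts rather than running totals, a cumulative series has to be derived from it. A minimal sketch in base R, assuming a small made-up data frame (toy_df, with invented numbers and an invented country name) that mimics a subset of the columns listed above:

```r
# Toy data with some of the same column names as the coronavirus dataset.
# All values are invented for illustration; they are not real counts.
toy_df <- data.frame(
  date = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                   "2020-03-01", "2020-03-02", "2020-03-03")),
  country = "Exampleland",
  type = rep(c("confirmed", "death"), each = 3),
  cases = c(5, 8, 13, 0, 1, 2),
  stringsAsFactors = FALSE
)

# Sort by type and date so the running sum follows calendar order
toy_df <- toy_df[order(toy_df$type, toy_df$date), ]

# ave() applies cumsum() separately within each case type
toy_df$cumulative_cases <- ave(toy_df$cases, toy_df$type, FUN = cumsum)

print(toy_df)
```

The same idea should carry over to the real dataset by grouping on country and type instead of type alone.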
Summary of the total confirmed cases by country (top 20):
library(dplyr)

summary_df <- coronavirus %>%
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases)) %>%
  arrange(-total_cases)

summary_df %>% head(20)
Summary of new cases during the past 24 hours by country and type (as of the most recent date in the dataset, max(coronavirus$date)):
library(tidyr)

coronavirus %>%
  filter(date == max(date)) %>%
  select(country, type, cases) %>%
  group_by(country, type) %>%
  summarise(total_cases = sum(cases)) %>%
  pivot_wider(names_from = type, values_from = total_cases) %>%
  arrange(-confirmed)
Plotting the total cases by type worldwide:
library(plotly)

coronavirus %>%
  group_by(type, date) %>%
  summarise(total_cases = sum(cases)) %>%
  pivot_wider(names_from = type, values_from = total_cases) %>%
  arrange(date) %>%
  mutate(active = confirmed - death - recovered) %>%
  mutate(active_total = cumsum(active),
         recovered_total = cumsum(recovered),
         death_total = cumsum(death)) %>%
  plot_ly(x = ~ date,
          y = ~ active_total,
          name = 'Active',
          fillcolor = '#1f77b4',
          type = 'scatter',
          mode = 'none',
          stackgroup = 'one') %>%
  add_trace(y = ~ death_total,
            name = "Death",
            fillcolor = '#E41317') %>%
  add_trace(y = ~ recovered_total,
            name = 'Recovered',
            fillcolor = 'forestgreen') %>%
  layout(title = "Distribution of Covid19 Cases Worldwide",
         legend = list(x = 0.1, y = 0.9),
         yaxis = list(title = "Number of Cases"),
         xaxis = list(title = "Source: Johns Hopkins University Center for Systems Science and Engineering"))
library(plotly)

p <- coronavirus %>%
  group_by(type, date) %>%
  summarise(total_cases = sum(cases), .groups = "drop") %>%
  pivot_wider(names_from = type, values_from = total_cases) %>%
  arrange(date) %>%
  mutate(active = confirmed - death - recovered) %>%
  mutate(active_total = cumsum(active),
         recovered_total = cumsum(recovered),
         death_total = cumsum(death)) %>%
  plot_ly(x = ~ date,
          y = ~ active_total,
          name = 'Active',
          fillcolor = '#1f77b4',
          type = 'scatter',
          mode = 'none',
          stackgroup = 'one') %>%
  add_trace(y = ~ death_total,
            name = "Death",
            fillcolor = '#E41317') %>%
  add_trace(y = ~ recovered_total,
            name = 'Recovered',
            fillcolor = 'forestgreen') %>%
  layout(title = "Distribution of Covid19 Cases Worldwide",
         legend = list(x = 0.1, y = 0.9),
         yaxis = list(title = "Number of Cases"),
         xaxis = list(title = "Source: Johns Hopkins University Center for Systems Science and Engineering"))

orca(p, "man/figures/total_cases.png")
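The plotting pipeline derives the daily change in active cases as confirmed minus death minus recovered, and then accumulates each daily series with cumsum() before stacking them. A minimal base R sketch of that arithmetic, assuming a made-up wide-format data frame (daily, with invented counts) in place of the real pivoted data:

```r
# Invented daily counts in wide format (one column per case type)
daily <- data.frame(
  date = as.Date("2020-03-01") + 0:3,
  confirmed = c(10, 20, 15, 5),
  death = c(0, 1, 2, 1),
  recovered = c(0, 3, 6, 9)
)

# Daily net change in active cases
daily$active <- daily$confirmed - daily$death - daily$recovered

# Running totals, matching the series used in the stacked area chart
daily$active_total <- cumsum(daily$active)
daily$death_total <- cumsum(daily$death)
daily$recovered_total <- cumsum(daily$recovered)

# Sanity check: the three stacked series add up to cumulative confirmed cases
stacked_total <- daily$active_total + daily$death_total + daily$recovered_total
stopifnot(all(stacked_total == cumsum(daily$confirmed)))

print(daily)
```

Note that the daily active value can be negative on days when deaths and recoveries outnumber new confirmed cases, while the stacked total still equals the cumulative confirmed count.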
Plot the distribution of confirmed cases by country with a treemap plot:
conf_df <- coronavirus %>%
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases)) %>%
  arrange(-total_cases) %>%
  mutate(parents = "Confirmed") %>%
  ungroup()

plot_ly(data = conf_df,
        type = "treemap",
        values = ~total_cases,
        labels = ~ country,
        parents = ~parents,
        domain = list(column = 0),
        name = "Confirmed",
        textinfo = "label+value+percent parent")
conf_df <- coronavirus %>%
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases), .groups = "drop") %>%
  arrange(-total_cases) %>%
  mutate(parents = "Confirmed") %>%
  ungroup()

p <- plot_ly(data = conf_df,
             type = "treemap",
             values = ~total_cases,
             labels = ~ country,
             parents = ~parents,
             domain = list(column = 0),
             name = "Confirmed",
             textinfo = "label+value+percent parent")

orca(p, "man/figures/treemap_conf.png")
The raw data is pulled and arranged by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) from the following resources: