knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

covidvirus

License: MIT

The {covidvirus} package provides a tidy-format dataset of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic. The raw data are pulled from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) Coronavirus repository.

This package was inspired by Rami Krispin's {coronavirus} package. The key difference is that Rami's package ships a dataset that the package author must update manually. In this package, I've used (i.e., respectfully copied) his data retrieval code and modified it to rely more heavily on "tidyverse"-style packages such as {janitor} and {lubridate}. Another key difference is the column names: because I've used {janitor}, they are in snake case and use underscores ('_') instead of dots ('.').
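As a rough illustration of the naming difference (this is not the package's actual retrieval code), the sketch below applies janitor::clean_names() to a small made-up data frame. The dotted column names and the values are assumptions chosen for the example.

```r
library(janitor)

# Hypothetical raw data with dotted column names (made up for illustration)
raw <- data.frame(
  Province.State = c("Hubei", "Ontario"),
  Country.Region = c("China", "Canada"),
  check.names = FALSE
)

# clean_names() converts the column names to snake case
tidy <- clean_names(raw)
names(tidy)
```

The resulting names use underscores in place of the dots, which is the convention the columns in this package follow.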

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("nikdata/covidvirus")

Usage

This basic example retrieves the data in long format (wide = FALSE):

library(covidvirus)

corona_virus <- get_cases(wide = FALSE)

Similar to Rami's package, the output is as follows:

head(corona_virus)
tail(corona_virus) 

Here's an example of total cases by region and type (first 20 rows, sorted by total cases in descending order):

library(dplyr)

corona_virus %>%
  group_by(country_region, type) %>%
  summarize(total_cases = sum(cases)) %>%
  arrange(desc(total_cases)) %>%
  head(20)

To create a wide dataframe manually, you can do the following (using the wide = TRUE argument of get_cases() instead is recommended):

library(tidyr)

corona_virus %>% 
  filter(date == max(date)) %>%
  select(country = country_region, type, cases) %>%
  group_by(country, type) %>%
  summarize(total_cases = sum(cases, na.rm = TRUE)) %>%
  ungroup() %>%
  pivot_wider(names_from = type,
              values_from = total_cases) %>%
  arrange(desc(confirmed)) %>%
  head(10)

Wide Dataframe

Sometimes it may be easier to have a "wide" dataframe that enables you to see the number of cases for each type in their own respective columns.

covidvirus::get_cases(wide = TRUE) %>%
  head(10)
dplyr::glimpse(covidvirus::get_cases(wide = TRUE))
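Since get_cases() needs network access, here is a minimal offline sketch of the long-to-wide reshaping idea on made-up data. The column names follow the examples above, but the values and the "death" type label are assumptions for illustration, not output from the package.

```r
library(dplyr)
library(tidyr)

# Made-up long-format data: one row per region and case type
long_cases <- tibble(
  country_region = c("A", "A", "B", "B"),
  type = c("confirmed", "death", "confirmed", "death"),
  cases = c(10, 1, 25, 3)
)

# Spread each case type into its own column
wide_cases <- long_cases %>%
  pivot_wider(names_from = type, values_from = cases)

wide_cases
```

Each region ends up on a single row, with one column per case type, which is the shape described in this section.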

Comments

I would greatly appreciate any feedback you may have. If you find a bug, please file an issue.

Finally, a HUGE thanks to Rami Krispin for creating the {coronavirus} package!



nikdata/covidvirus documentation built on April 2, 2020, 4:06 a.m.