knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(jpinfect)
The jpinfect package provides tools for acquiring and processing notifiable infectious disease data in Japan. The package includes built-in datasets and functions to download, read and manipulate data from the Japan Institute for Health Security (JIHS). It also provides functions to merge datasets, transform data formats and check data sources.
This package is designed to assist researchers, epidemiologists, public health officials and developers in accessing, cleaning, and manipulating data for epidemiological analysis. The package is particularly useful for those working with infectious disease data in Japan, as it provides a streamlined process for obtaining and processing data from the JIHS.
The jpinfect package depends on the following R packages:
dplyr: for data manipulation
future: for parallel processing
future.apply: for parallel processing
httr: for HTTP requests
magrittr: for piping
readr: for reading CSV files
readxl: for reading Excel files
stats: for statistical functions
stringi: for string manipulation
stringr: for string manipulation
tidyr: for data tidying
tidyselect: for data selection
utils: for utility functions
The jpinfect package can be installed from CRAN, or from GitHub using the remotes package. The GitHub version includes the latest provisional weekly case report (bullet) data, which may not yet be available on CRAN. To install the package, run one of the following commands in your R console:
From CRAN:
install.packages("jpinfect")
From GitHub (for the latest version):
if (!require("remotes")) install.packages("remotes")
remotes::install_github("TomonoriHoshi/jpinfect")
Load the package after installation:
library(jpinfect)
The jpinfect package includes three built-in datasets that can be used to start immediate data analysis. These datasets are:
sex_prefecture: Confirmed weekly case reports on the sex distribution of reported cases by prefecture from 1999 to 2023.
place_prefecture: Confirmed weekly case reports about the place of infection by prefecture between 2001 and 2023.
bullet: Provisional weekly case reports by prefecture from 2024 up to the latest available report.
These datasets are provided in a tidy format, making them easy to work with using the dplyr and tidyr packages.
data("sex_prefecture")
data("place_prefecture")
data("bullet")
str(sex_prefecture)
str(place_prefecture)
str(bullet)
The jpinfect_merge function merges these datasets into a single dataset when needed, so that users can begin their analysis immediately.
# Load the built-in datasets
data("sex_prefecture")
data("place_prefecture")
data("bullet")

# Merge two datasets
confirmed_dataset <- jpinfect_merge(sex_prefecture, place_prefecture)

# Merge three datasets
bind_result <- jpinfect_merge(sex_prefecture, place_prefecture, bullet)

# Check the structure of the merged datasets
head(confirmed_dataset)
head(bind_result)
The jpinfect_pivot function enables users to seamlessly pivot datasets between wide and long formats. This functionality is particularly useful for reorganising data to suit analysis or visualisation needs.
# Convert from wide to long format
bullet_long <- jpinfect_pivot(bullet)

# Convert from long to wide format
bullet_wide <- jpinfect_pivot(bullet_long)

# Check the structure of long format
head(bullet_long)

# Check the structure of wide format
head(bullet_wide)
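To make the wide and long layouts concrete, the following is a minimal sketch on a toy table using tidyr (one of the dependencies listed above). It is not the implementation of jpinfect_pivot, and the column names (prefecture, week, measles, pertussis) are hypothetical and are not taken from the package's datasets.

```r
library(tidyr)

# Hypothetical wide table: one row per prefecture and week,
# one column per disease (names are illustrative only).
wide <- data.frame(
  prefecture = c("Tokyo", "Osaka"),
  week       = c(1L, 1L),
  measles    = c(2L, 0L),
  pertussis  = c(5L, 3L)
)

# Wide to long: the disease columns collapse into a name/value pair,
# giving one row per prefecture, week and disease.
long <- pivot_longer(
  wide,
  cols      = c(measles, pertussis),
  names_to  = "disease",
  values_to = "cases"
)

# Long to wide: spread the diseases back into one column each.
wide_again <- pivot_wider(long, names_from = disease, values_from = cases)
```

The long layout tends to suit grouped summaries and plotting, while the wide layout is often easier to scan by eye or to export as a table.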
Although built-in datasets are provided with this package, scientists, epidemiologists and public health officers should ideally review the whole data handling process, from upstream to downstream. For those who care about the precision of the datasets, jpinfect provides the following functions to rebuild the same datasets, or even the latest bullet dataset, from the raw data provided by the government.
The sources of these datasets can be checked by using jpinfect_url_confirmed for confirmed case reports and jpinfect_url_bullet for provisional case reports, respectively.
# Check data source URL for sex and prefecture data
jpinfect_url_confirmed(year = 2021, type = "sex")

# Check data source URL for place of infection and prefecture data
jpinfect_url_confirmed(year = 2021, type = "place")
The raw data can be downloaded using jpinfect_get_confirmed for confirmed case reports and jpinfect_get_bullet for provisional case reports. Confirmed weekly case data is organised into a single Microsoft Excel file for each year, while provisional data is provided as a separate CSV file for each week. Because these functions connect to the government website, downloads may take some time. To avoid placing an excessive burden on the server, please do not download the files more often than necessary. The downloaded files are saved in the raw_data folder or in the directory you specify.
# Download data for 2020 and 2021
jpinfect_get_confirmed(years = c(2020, 2021), type = "sex")

# Download English data for weeks 1 to 5 in 2025
jpinfect_get_bullet(year = 2025, week = 1:5, dest_dir = "raw_data")
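One way to avoid repeated downloads is to check the destination folder before calling a download function. The sketch below is only an illustration: download_if_missing is a hypothetical helper that is not part of jpinfect, and it assumes that the provisional files are saved as CSV files directly inside the destination directory.

```r
# Hypothetical helper (not part of jpinfect): run `downloader` only when
# `dir` contains no files matching `pattern`.
# Returns TRUE (invisibly) if a download was triggered, FALSE otherwise.
download_if_missing <- function(dir, pattern, downloader) {
  existing <- if (dir.exists(dir)) {
    list.files(dir, pattern = pattern)
  } else {
    character(0)
  }
  if (length(existing) > 0) {
    message("Found ", length(existing), " file(s) in ", dir, "; skipping download.")
    return(invisible(FALSE))
  }
  downloader()
  invisible(TRUE)
}

# Usage sketch (commented out because it needs network access):
# download_if_missing(
#   dir        = "raw_data",
#   pattern    = "\\.csv$",
#   downloader = function() {
#     jpinfect_get_bullet(year = 2025, week = 1:5, dest_dir = "raw_data")
#   }
# )
```

This check is deliberately coarse: it skips the download as soon as any matching file exists, so it does not detect individual missing weeks.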
Raw data saved to your local computer can be imported into R using jpinfect_read_confirmed and jpinfect_read_bullet.
# Read a single file
dataset2021 <- jpinfect_read_confirmed(path = "2021_Syu_01_1.xlsx")

# Read all files in a directory
place_dataset <- jpinfect_read_confirmed(path = "raw_data", type = "place")

# Read provisional data
bullet <- jpinfect_read_bullet(directory = "raw_data")
If you encounter any bugs or issues while using the jpinfect package, please report them on the GitHub Issues page. When reporting, please include the following information:
A clear description of the problem
Steps to reproduce the issue
Your R version and operating system
Relevant error messages
Example code to reproduce the problem