whobcnapp is an R package containing utilities for both (a)
transforming data per the specifications of the “Data Assessment Report”
and and (b) visualizing that data in the form of a production-ready
shiny web application.
You can install the most recent version of whobcnapp from
GitHub with the following:
# install.packages("devtools")
devtools::install_github('databrew/whobcnapp')
Alternatively, one can build the package locally by first cloning the
https://github.com/databrew/whobcnapp repository, then running the
build_package.R script.
The functions for “cleaning” data are customized for the formats currently used by the office. In general terms, the work flow is as follows:
For further details on the rationale and formats of these transformations, see the “UHC Policy Watch Platform Design: DATA ASSESSMENT REPORT”.
One starts by preparing the workspace with the necessary packages:
library(tidyr)
library(dplyr)
library(lubridate)
library(readxl)
library(whobcnapp)
Next, one must define the input file to be transformed. For the purpose
of this example, we’ll use BUL_Appendix_tables.xlsx which is available
at
https://github.com/databrew/whobcnapp/blob/master/data-raw/BUL_Appendix_tables.xlsx.
input_file <- 'data-raw/BUL_Appendix_tables.xlsx'
Having defined the path to the data, one uses the function
conformity_check to ensure that the data to be read conforms to the
requirements of the ETL process:
ok <- conformity_check(input_file)
If the file’s format is correct, the function will return a boolean
TRUE, and print a message saying:
The file conforms. OK to go on.
Otherwise, the function will return the nature of the non-conformity. For example:
conformity_check('xyz_tables.xlsx')
Error in conformity_check("xyz_tables.xlsx") :
The file should be named "ISO_Appendix_tables.xlsx", where "ISO" is the 3 letter ISO code of the country in question.
Having confirmed that the file to be read is of the correct format, one
can use the read_data function to load each sheet of the Excel
workbook into memory:
data <- read_data(input_file)
The result of the read_data function is a list of dataframes, in which
the name of each element in the list reflects the figure for which the
data corresponds.
cat(paste0(names(data), collapse = '\n'))
#> fig1
#> fig2
#> fig3
#> fig3a
#> fig4
#> fig5
#> t2a
#> t2b
#> t2c
Data is structured uniformly into a “long” format, following the principles of “tidy data”. All dataframe share the following columns:
Whereas some dataframes have additional columns such as:
Below are the column names and first 3 rows of each dataset as an example:
for(i in 1:length(data)){
print(names(data)[i])
print(head(data[[i]], 3))
}
#> [1] "fig1"
#> # A tibble: 3 x 5
#> quintile year value iso indicator
#> <fct> <chr> <dbl> <chr> <chr>
#> 1 Poorest 2005 0.0468 BUL Out-of-pocket payments for health care as a share…
#> 2 2nd 2005 0.0625 BUL Out-of-pocket payments for health care as a share…
#> 3 3rd 2005 0.0593 BUL Out-of-pocket payments for health care as a share…
#> [1] "fig2"
#> # A tibble: 3 x 6
#> grp year value quintile iso indicator
#> <chr> <chr> <dbl> <fct> <chr> <chr>
#> 1 Medicines 2005 0.879 Poorest BUL Breakdown of out-of-pocket spending b…
#> 2 MedicalPr… 2005 0.00864 Poorest BUL Breakdown of out-of-pocket spending b…
#> 3 Outpatien… 2005 0.0298 Poorest BUL Breakdown of out-of-pocket spending b…
#> [1] "fig3"
#> # A tibble: 3 x 5
#> grp year value indicator iso
#> <chr> <chr> <dbl> <chr> <chr>
#> 1 FurtherImpove… 2005 0.0491 Share of households at risk of impoverishme… BUL
#> 2 Impoverished 2005 0.0247 Share of households at risk of impoverishme… BUL
#> 3 AtRiskOfImpov… 2005 0.0801 Share of households at risk of impoverishme… BUL
#> [1] "fig3a"
#> # A tibble: 3 x 5
#> grp year value indicator iso
#> <chr> <chr> <dbl> <chr> <chr>
#> 1 FurtherImpove… 2005 424 Households at risk of impoverishment after o… BUL
#> 2 Impoverished 2005 130 Households at risk of impoverishment after o… BUL
#> 3 AtRiskOfImpov… 2005 550 Households at risk of impoverishment after o… BUL
#> [1] "fig4"
#> # A tibble: 3 x 5
#> quintile year value iso indicator
#> <fct> <chr> <dbl> <chr> <chr>
#> 1 Poorest 2005 0.505 BUL Share of households with catastrophic spending by…
#> 2 2nd 2005 0.181 BUL Share of households with catastrophic spending by…
#> 3 3rd 2005 0.0889 BUL Share of households with catastrophic spending by…
#> [1] "fig5"
#> # A tibble: 3 x 6
#> grp year value quintile iso indicator
#> <chr> <chr> <dbl> <fct> <chr> <chr>
#> 1 Medicines 2005 0.868 Poorest BUL Breakdown of catastrophic spending by…
#> 2 MedicalPr… 2005 0.00786 Poorest BUL Breakdown of catastrophic spending by…
#> 3 Outpatien… 2005 0.0265 Poorest BUL Breakdown of catastrophic spending by…
#> [1] "t2a"
#> # A tibble: 3 x 4
#> quintile year value indicator
#> <chr> <chr> <dbl> <chr>
#> 1 Poorest 2005 43.3 Mean annual per capita OOP by income quintile
#> 2 2nd 2005 86.0 Mean annual per capita OOP by income quintile
#> 3 3rd 2005 99.6 Mean annual per capita OOP by income quintile
#> [1] "t2b"
#> # A tibble: 3 x 4
#> grp year value indicator
#> <chr> <chr> <dbl> <chr>
#> 1 Medicines 2005 79.7 Mean annual per capita OOP by structure
#> 2 Medical products 2005 4.03 Mean annual per capita OOP by structure
#> 3 Outpatient care 2005 4.18 Mean annual per capita OOP by structure
#> [1] "t2c"
#> # A tibble: 3 x 5
#> grp year value indicator quintile
#> <chr> <chr> <dbl> <chr> <fct>
#> 1 Medicines 2005 78.6 Mean annual per capita OOP by stru… Total populat…
#> 2 Medical produc… 2005 3.98 Mean annual per capita OOP by stru… Total populat…
#> 3 Outpatient care 2005 4.12 Mean annual per capita OOP by stru… Total populat…
Having transformed data to a tidy, machine-readable format, standardized
visualizations can be constructed. In the whobcnapp library, this
functionality is indicated by functions with the plot_ prefix. These
functions take a data object (as generated by read_data) and return
a ggplot2() object.
First, we’ll transform data so as to prepare it for plotting (covered in previous section).
# Define the path to the input file
input_file <- 'data-raw/BUL_Appendix_tables.xlsx'
# Confirm that the data is correctly formatted
ok <- conformity_check(input_file)
# Transform the data to a tidy/long format
data <- read_data(input_file)
Now, we can pass the data object to the plot_... function as such:
plot_fig1(data = data)

plot_fig2(data = data)

plot_fig3(data = data)

plot_fig4(data = data)

plot_fig5(data = data)

plot_t2(data = data)

In a development environment, an engineer will likely carry out the following work flow:
https://github.com/databrew/whobcnapp cd into it cd whobcnapp R/app.R (for the shiny application
itself) or in R/... (for upstream function code) cd into dev and run Rscript run_dev.R to quickly re-build and
serve the application locally.To run the web application (under construction), reproduce the following code in an R session:
library(whobcnapp)
whobcnapp::run_app()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.