README.md

whobcnapp

Lifecycle:
experimental Codecov test
coverage

whobcnapp is an R package containing utilities for both (a) transforming data per the specifications of the “Data Assessment Report” and and (b) visualizing that data in the form of a production-ready shiny web application.

Installation

You can install the most recent version of whobcnapp from GitHub with the following:

# install.packages("devtools")
devtools::install_github('databrew/whobcnapp')

Alternatively, one can build the package locally by first cloning the https://github.com/databrew/whobcnapp repository, then running the build_package.R script.

Usage

Data transformation utilities

Overview

The functions for “cleaning” data are customized for the formats currently used by the office. In general terms, the work flow is as follows:

For further details on the rationale and formats of these transformations, see the “UHC Policy Watch Platform Design: DATA ASSESSMENT REPORT”.

Walk-through

One starts by preparing the workspace with the necessary packages:

library(tidyr)
library(dplyr)
library(lubridate)
library(readxl)
library(whobcnapp)

Next, one must define the input file to be transformed. For the purpose of this example, we’ll use BUL_Appendix_tables.xlsx which is available at https://github.com/databrew/whobcnapp/blob/master/data-raw/BUL_Appendix_tables.xlsx.

input_file <- 'data-raw/BUL_Appendix_tables.xlsx'

Having defined the path to the data, one uses the function conformity_check to ensure that the data to be read conforms to the requirements of the ETL process:

ok <- conformity_check(input_file)

If the file’s format is correct, the function will return a boolean TRUE, and print a message saying:

The file conforms. OK to go on.

Otherwise, the function will return the nature of the non-conformity. For example:

conformity_check('xyz_tables.xlsx')
Error in conformity_check("xyz_tables.xlsx") : 
  The file should be named "ISO_Appendix_tables.xlsx", where "ISO" is the 3 letter ISO code of the country in question.

Having confirmed that the file to be read is of the correct format, one can use the read_data function to load each sheet of the Excel workbook into memory:

data <- read_data(input_file)

The result of the read_data function is a list of dataframes, in which the name of each element in the list reflects the figure for which the data corresponds.

cat(paste0(names(data), collapse = '\n'))
#> fig1
#> fig2
#> fig3
#> fig3a
#> fig4
#> fig5
#> t2a
#> t2b
#> t2c

Data is structured uniformly into a “long” format, following the principles of “tidy data”. All dataframe share the following columns:

Whereas some dataframes have additional columns such as:

Below are the column names and first 3 rows of each dataset as an example:

for(i in 1:length(data)){
  print(names(data)[i])
  print(head(data[[i]], 3))
}
#> [1] "fig1"
#> # A tibble: 3 x 5
#>   quintile year   value iso   indicator                                         
#>   <fct>    <chr>  <dbl> <chr> <chr>                                             
#> 1 Poorest  2005  0.0468 BUL   Out-of-pocket payments for health care as a share…
#> 2 2nd      2005  0.0625 BUL   Out-of-pocket payments for health care as a share…
#> 3 3rd      2005  0.0593 BUL   Out-of-pocket payments for health care as a share…
#> [1] "fig2"
#> # A tibble: 3 x 6
#>   grp        year    value quintile iso   indicator                             
#>   <chr>      <chr>   <dbl> <fct>    <chr> <chr>                                 
#> 1 Medicines  2005  0.879   Poorest  BUL   Breakdown of out-of-pocket spending b…
#> 2 MedicalPr… 2005  0.00864 Poorest  BUL   Breakdown of out-of-pocket spending b…
#> 3 Outpatien… 2005  0.0298  Poorest  BUL   Breakdown of out-of-pocket spending b…
#> [1] "fig3"
#> # A tibble: 3 x 5
#>   grp            year   value indicator                                    iso  
#>   <chr>          <chr>  <dbl> <chr>                                        <chr>
#> 1 FurtherImpove… 2005  0.0491 Share of households at risk of impoverishme… BUL  
#> 2 Impoverished   2005  0.0247 Share of households at risk of impoverishme… BUL  
#> 3 AtRiskOfImpov… 2005  0.0801 Share of households at risk of impoverishme… BUL  
#> [1] "fig3a"
#> # A tibble: 3 x 5
#>   grp            year  value indicator                                     iso  
#>   <chr>          <chr> <dbl> <chr>                                         <chr>
#> 1 FurtherImpove… 2005    424 Households at risk of impoverishment after o… BUL  
#> 2 Impoverished   2005    130 Households at risk of impoverishment after o… BUL  
#> 3 AtRiskOfImpov… 2005    550 Households at risk of impoverishment after o… BUL  
#> [1] "fig4"
#> # A tibble: 3 x 5
#>   quintile year   value iso   indicator                                         
#>   <fct>    <chr>  <dbl> <chr> <chr>                                             
#> 1 Poorest  2005  0.505  BUL   Share of households with catastrophic spending by…
#> 2 2nd      2005  0.181  BUL   Share of households with catastrophic spending by…
#> 3 3rd      2005  0.0889 BUL   Share of households with catastrophic spending by…
#> [1] "fig5"
#> # A tibble: 3 x 6
#>   grp        year    value quintile iso   indicator                             
#>   <chr>      <chr>   <dbl> <fct>    <chr> <chr>                                 
#> 1 Medicines  2005  0.868   Poorest  BUL   Breakdown of catastrophic spending by…
#> 2 MedicalPr… 2005  0.00786 Poorest  BUL   Breakdown of catastrophic spending by…
#> 3 Outpatien… 2005  0.0265  Poorest  BUL   Breakdown of catastrophic spending by…
#> [1] "t2a"
#> # A tibble: 3 x 4
#>   quintile year  value indicator                                    
#>   <chr>    <chr> <dbl> <chr>                                        
#> 1 Poorest  2005   43.3 Mean annual per capita OOP by income quintile
#> 2 2nd      2005   86.0 Mean annual per capita OOP by income quintile
#> 3 3rd      2005   99.6 Mean annual per capita OOP by income quintile
#> [1] "t2b"
#> # A tibble: 3 x 4
#>   grp              year  value indicator                              
#>   <chr>            <chr> <dbl> <chr>                                  
#> 1 Medicines        2005  79.7  Mean annual per capita OOP by structure
#> 2 Medical products 2005   4.03 Mean annual per capita OOP by structure
#> 3 Outpatient care  2005   4.18 Mean annual per capita OOP by structure
#> [1] "t2c"
#> # A tibble: 3 x 5
#>   grp             year  value indicator                           quintile      
#>   <chr>           <chr> <dbl> <chr>                               <fct>         
#> 1 Medicines       2005  78.6  Mean annual per capita OOP by stru… Total populat…
#> 2 Medical produc… 2005   3.98 Mean annual per capita OOP by stru… Total populat…
#> 3 Outpatient care 2005   4.12 Mean annual per capita OOP by stru… Total populat…

Data visualization utilities

Having transformed data to a tidy, machine-readable format, standardized visualizations can be constructed. In the whobcnapp library, this functionality is indicated by functions with the plot_ prefix. These functions take a data object (as generated by read_data) and return a ggplot2() object.

Walk-through

First, we’ll transform data so as to prepare it for plotting (covered in previous section).

# Define the path to the input file
input_file <- 'data-raw/BUL_Appendix_tables.xlsx'
# Confirm that the data is correctly formatted
ok <- conformity_check(input_file)
# Transform the data to a tidy/long format
data <- read_data(input_file)

Now, we can pass the data object to the plot_... function as such:

Figure 1
plot_fig1(data = data)

Figure 2
plot_fig2(data = data)

Figure 3
plot_fig3(data = data)

Figure 4
plot_fig4(data = data)

Figure 5
plot_fig5(data = data)

T2 data
plot_t2(data = data)

Web application

Development

In a development environment, an engineer will likely carry out the following work flow:

  1. Clone this repository: https://github.com/databrew/whobcnapp
  2. cd into it cd whobcnapp
  3. Open and modify the code in R/app.R (for the shiny application itself) or in R/... (for upstream function code)
  4. cd into dev and run Rscript run_dev.R to quickly re-build and serve the application locally.

Production

To run the web application (under construction), reproduce the following code in an R session:

library(whobcnapp)
whobcnapp::run_app()


databrew/whobcnapp documentation built on Dec. 19, 2021, 8:12 p.m.