An in-depth examination of the effect of malaria control activities on the health and productivity of Maragra sugarcane factory workers
This code repository serves as the "research compendium" for the analysis of Maragra malaria incidence, malaria control, and worker absenteeism and productivity data. The purpose of this repository - which takes the form of a fully documented R package - is "to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them" (Gentleman and Lang, 2004).
In research, reproducibility is key to transparency and reliability. By using this package, any authorized collaborator can reproduce every step of the analysis and manuscript writing by following the instructions below.
$ git clone https://github.com/joebrew/maragra
First, you'll need to populate your data-raw folder. This is where you should place the files that are not publicly available. These are:
├── HRS - Leave Applications.csv
├── HRS - Leave Applications - Sheet 2.csv
├── HRS - Leave Applications.xls
└── maragra_monthly_data.csv
If you do not have these files but are an authorized collaborator, email joebrew@gmail.com.
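Before processing, it can help to confirm that all four files are in place. The sketch below is a minimal check in base R. The helper name missing_raw_files is hypothetical and not part of the maragra package, and the sketch assumes the files sit directly inside data-raw and that it is run from the maragra directory.

```r
# Hypothetical helper (not part of maragra): report which of the
# expected raw files are absent from a directory.
missing_raw_files <- function(dir = "data-raw") {
  expected <- c(
    "HRS - Leave Applications.csv",
    "HRS - Leave Applications - Sheet 2.csv",
    "HRS - Leave Applications.xls",
    "maragra_monthly_data.csv"
  )
  # Keep only the names whose full path does not exist
  expected[!file.exists(file.path(dir, expected))]
}

# An empty result means every expected file was found
missing <- missing_raw_files("data-raw")
if (length(missing) > 0) {
  message("Missing from data-raw: ", paste(missing, collapse = ", "))
}
```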
You'll need to send some credentials to the CISM database when accessing it. Create a credentials.yaml file in the credentials folder with the following values:
dbname: openhds
host: sap.manhica.net
port: 4706 # (use 3306 instead if based in the CISM)
user: xxx
password: xxx
Replace the xxx placeholders in the user and password fields with the database credentials provided by the CISM. If you do not have these credentials but are an authorized collaborator, email joebrew@gmail.com.
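A quick way to catch a malformed credentials file is to parse it and check the fields before connecting. The sketch below is illustrative only: check_credentials is a hypothetical helper, not a maragra function. It assumes the yaml package is installed and that the file lives at credentials/credentials.yaml relative to the working directory.

```r
# Hypothetical helper (not part of maragra): check that a parsed
# credentials list has every field listed above.
check_credentials <- function(creds) {
  required <- c("dbname", "host", "port", "user", "password")
  missing <- setdiff(required, names(creds))
  if (length(missing) > 0) {
    stop("credentials.yaml is missing: ", paste(missing, collapse = ", "))
  }
  # The 'xxx' placeholders must be replaced with real values
  if (identical(creds$user, "xxx") || identical(creds$password, "xxx")) {
    stop("Replace the placeholder user and password values")
  }
  invisible(TRUE)
}

# Assumed location: credentials/credentials.yaml under the working directory
path <- file.path("credentials", "credentials.yaml")
if (requireNamespace("yaml", quietly = TRUE) && file.exists(path)) {
  check_credentials(yaml::read_yaml(path))
}
```

The check only validates the shape of the file; it does not attempt a database connection.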
The raw data is not in a form suitable for analysis. Running the script below creates tidy datasets from the raw data, which can then be called by name after attaching the maragra package.
To process data, run the following in an R session from within the maragra directory:
setwd('data-raw')
source('create_data_files.R')
Alternatively, you can run this directly from bash:
$ cd data-raw
$ Rscript create_data_files.R
Having run this, you'll now have the following datasets for analysis:
d <- data(package = "maragra")
d <- paste0('- ', d$results[,'Item'], ' \n')
cat(d)
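To install the package and generate its documentation, run the following in an R session from the directory that contains the maragra folder:
#!/usr/bin/R
devtools::install('maragra')
devtools::document('maragra')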
If the above fails, the most likely cause is missing R package dependencies. Examine the error messages, and install R packages as necessary. Note that the cism and brew packages are available from https://github.com/joebrew, whereas the databrew package is available from https://github.com/databrew.
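Rather than reading error messages one at a time, you could check a list of candidate packages up front. The sketch below uses only base R. missing_packages is a hypothetical helper, and the package names passed to it are the ones mentioned in this README, not a complete dependency list.

```r
# Hypothetical helper (not part of maragra): return the packages
# from a candidate list whose namespaces cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Package names taken from this README; extend the vector with any
# others named in the error messages
missing_packages(c("devtools", "cism", "brew", "databrew"))
```

Note that requireNamespace loads each namespace it finds as a side effect, which is harmless here but worth knowing.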
As an alternative to the above, or to update documentation (including this README), run the following in an R session from within the maragra directory:
source('build_package.R')
Alternatively, you can run the script directly from bash:
$ Rscript build_package.R
Unit tests have been written to ensure that the processed data and packages are behaving correctly. To run them, execute the following in an R session from within the maragra directory:
setwd('tests')
source('testthat.R')
All analysis is written in the .Rmd format. Analysis files are in the inst/rmd folder. These can be knitted directly using rmarkdown::render, or called via specific functions. For example, to compile the "Maragra clinical malaria incidence" report, run the following from within an R session:
maragra::generate_maragra_clinical_malaria_incidence()
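Knitting a report directly, without the wrapper function, might look like the sketch below. list_reports is a hypothetical helper, not a maragra function. The sketch assumes it is run from the maragra directory so that inst/rmd resolves, and it makes no assumption about specific report file names. Individual reports may need data or parameters that this sketch does not supply.

```r
# Hypothetical helper (not part of maragra): list the .Rmd files in a folder.
list_reports <- function(rmd_dir = file.path("inst", "rmd")) {
  list.files(rmd_dir, pattern = "\\.Rmd$", full.names = TRUE)
}

reports <- list_reports()

# Render the first report found, if any, into the outputs folder
if (length(reports) > 0 && requireNamespace("rmarkdown", quietly = TRUE)) {
  rmarkdown::render(reports[1], output_dir = "outputs")
}
```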
All functions whose names begin with generate produce an output (either pdf or html). These functions also share arguments for date, output_dir, and output_file. So, one could specify that the output of an analysis should go to the outputs folder and be named "incidence_report.html" as follows:
maragra::generate_maragra_clinical_malaria_incidence(date = Sys.Date(), output_dir = 'outputs', output_file = 'incidence_report.html')
To run all analyses and write all reports and papers to the outputs folder, run the following from an R session within the maragra folder:
maragra::run_all_analyses(output_dir = 'outputs')
This research is for Joe Brew's PhD. Full details are at www.economicsofmalaria.com. He is generously funded by the Erasmus Mundus Joint Doctorate Fellowship, Specific Grant Agreement 2016-1346. His program of study is the Transdisciplinary Global Health programme.
A debt of gratitude is owed to the following people and institutions, without whom this research would not be possible: