# csl: journal-of-health-economics.csl # Packages library(tidyverse) library(knitr) library(Hmisc) library(brew) library(maragra) library(RColorBrewer) library(extrafont) library(kableExtra) library(raster) loadfonts() ## Global options options(max.print="75") opts_chunk$set(echo=FALSE, cache=FALSE, prompt=FALSE, fig.height = 4, fig.width = 5, tidy=TRUE, comment=NA, message=FALSE, warning=FALSE, dev = "cairo_pdf", fig.pos="!h", fig.align = 'center') opts_knit$set(width=75) options(xtable.comment = FALSE)
source('prepare_data.R')
\begin{center} \begin{large}
Maragra models
\end{large} \end{center}
\vspace{5mm}
\begin{center}
\textbf{Overview}
\end{center}
\vspace{5mm} \begin{center} \begin{changemargin}{3cm}{3cm}
This document contains methods and basic results for a few different modeling approaches. It was produced in September 2018 in Amsterdam. The purpose of this document is to give an overview of the modeling approached employed so far, as well as provide a summary of results.
\end{changemargin} \end{center}
\vspace{20mm}
\noindent\fbox{% \parbox{\textwidth}{% \subsection*{Main points} \begin{itemize} \item We have combined individual and indirect protection into one index \item We have simplified our model approach \item Results are consistent/coherent \end{itemize} \vspace{2mm} }% }
\vfill \null
\subsection*{Desinataires} \textbf{Elisa Sicuri; Menno Pradhan}
\vspace{3mm}
\newpage
In the draft of the paper (\href{https://docs.google.com/document/d/1bUWRBCgVcgjSPHchIQxiTG8Vwv5hV1GLU4Tlu386sWA/edit#}{HERE} ), the model formula is:
$$ \hat{Y_{it}} = \hat{\beta}{0} + \hat{\beta}{1}\text{Season}{t} * IRS{it} + \hat{\beta}2{RainyDay{t}} + \hat{\beta}3{Herd{it}} + \alpha_i + \delta_t + \upsilon_{it} $$
In R, using the felm
library, this is written as follows:
protection_models[[1]]$call
The above notation shows absent
(binary 1,0) as a function of the interaction of binary season
(high vs. low) and binary months_since
(before vs. after) plus binary rainy_day
(whether or not it rained on the particular day) plus continuous numeric herd
(ie, the protection afforded by others) plus fixed effects for both the worker (oracle_number
) and the year (malaria_year
). $\alpha_i$ represents the time invariant worker fixed effects, and $\delta_i$ represents the fixed effect of the particular malaria season. $\upsilon$ is the error term. This model is run separately for the 4 different worker groups, yielding the following results:
make_models_table(model_list = protection_models, the_caption = "All absenteeism with herd immunity: model results", type = 'latex', multiplier = 1)
In the above, "after" IRS application (ie, the 6 months following IRS) is associated with a significant reduction in absenteeism for both kinds of permanent workers, and does not have a signficiant association with a change in absenteeism for both kinds of temporary workers.
The new approach, discussed September 26 2018, is characterized by the following:
$$ \hat{Y_{it}} = \hat{\beta}{0} + \hat{\beta}{1}\text{exp(protection)}{it} + \hat{\beta}{2}\text{Rain}{it} + \alpha_i + \upsilon{it} $$
The above notation shows absent
(binary 1,0) as a function of the exponentiation of the protection
variable (the summation of individual plus indirectly conferred protection) and the rain
variable (the summed lag of rainfall from 4-13 weeks). $\alpha_i$ represents the time invariant worker fixed effects, $\upsilon$ is the error term.
In R, our new model looks like this
final_model$call
Where protection
is the sum of all nearby IRS, weighted by the inverse of the distance from the residence of the person (including the person's own residence, set at a distance of 10 meters).
head(monthly)
library(cowplot) ggplot(data = monthly, aes(x = absent)) + geom_histogram(fill = 'grey', color = 'black') + labs(x = 'Absenteeism rate', y = 'Count', title = 'Distribution of monthly absenteeism rate')
ggplot(data = monthly, aes(x = protection)) + geom_histogram(fill = 'grey', color = 'black') + labs(x = 'Protection score', y = 'Count', title = 'Distribution of protection score')
ggplot(data = monthly, aes(x = precipitation_lag2_3)) + geom_histogram(fill = 'grey', color = 'black') + labs(x = 'Precipitation lag', y = 'Count', title = 'Distribution of 2-3 month precipitation lag')
\newpage
The below table shows the results of the model devised thus far.
make_final_table(model = final_model, the_caption = "Model results", type = 'latex', multiplier = 1000)
The below charts show predicted absenteeism rates at different rain and protection levels.
fake <- expand.grid(#group = sort(unique(monthly$group)), precipitation_lag1_3 = seq(0, 18, by = 3), protection = seq(0,7, by = 1))#, fake$predicted <- fake$predicted_upr <- fake$predicted_lwr<- NA fe <- getfe(final_model) predictions <- exp(predict(final_model_lm, fake, interval = 'confidence'))-1 # predictions <- data.frame(predictions) # # Add the fixed effects # fixed_effects <- fe$effect fake$predicted<- predictions[,1] fake$predicted_upr <- predictions[,3] fake$predicted_lwr <- predictions[,2]
ggplot(data = fake %>% filter(protection == 1), aes(x = precipitation_lag1_3, y = predicted)) + geom_point() + geom_line() + geom_ribbon(aes(ymin = predicted_lwr, ymax = predicted_upr), alpha = 0.4, fill = 'darkorange') + labs(y = 'Predicted absenteeism rate', x = '1-3 month precipitation lag', title = 'Lagged precipitation and absenteeism')
ggplot(data = fake %>% filter(precipitation_lag1_3 == 3), aes(x = protection, y = predicted)) + geom_point() + geom_line() + geom_ribbon(aes(ymin = predicted_lwr, ymax = predicted_upr), alpha = 0.4, fill = 'darkorange') + labs(y = 'Predicted absenteeism rate', x = 'Protection index', title = 'IRS-afforded protection and absenteeism')
# MODEL WITH MENNO #fit <- lm(log(absent+1) ~ protection + precipitation_lag2_3, data = monthly) #ci <- confint(fit) # Segregate by working group # Add worker fixed effects # Time fixed effects? # Plot months and see if it's working # # fitbi <- glm(absent ~ protection + precipitation_lag2_3, data = monthly, # # family = binomial('logit')) # # Logit model # ci <- exp(confint(fitbi)) # data.frame(coefs = exp(coef(fitbi)), # lwr = ci[,1], # upr = ci[,2], # p = coef(summary(fitbi))[,4])
# Introduction is really all that matters - specifying contribution, summarizing results # Journal of Health Economics (1) and then Health Economics (2). # Elisa: American Journal of Health Economics (from American Health Economic Society) # Ensure that we explicitly explain these 21 days issue to rule out reverse causality. # Large firms are big enough to capture externalities # What is the effect of anti-malaria campaign on worker producitivity # Data special # Identification problems (no external spraying, no home spraying, unobserved govt - omitted variable bias dealt with through worker and time fixed effects.) # Exploring herd protection through various functional forms
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.