knitr::opts_chunk$set(eval = greta:::check_tf_version("message"), cache = TRUE)
set.seed(1)
Eight Schools is a study of coaching effects in eight schools; it comes from section 5.5 of Gelman et al. (2003), as covered in section 2.1 (Schools data) of 'R2WinBUGS: A Package for Running WinBUGS from R':
The Scholastic Aptitude Test (SAT) measures the aptitude of high-schoolers in order to help colleges to make admissions decisions. It is divided into two parts, verbal (SAT-V) and mathematical (SAT-M). Our data comes from the SAT-V (Scholastic Aptitude Test-Verbal) on eight different high schools, from an experiment conducted in the late 1970s. SAT-V is a standard multiple choice test administered by the Educational Testing Service. This Service was interested in the effects of coaching programs for each of the selected schools. The study included coached and uncoached pupils, about sixty in each of the eight different schools; see Rubin (1981). All of them had already taken the PSAT (Preliminary SAT), whose results were used as covariates. For each school, the estimated treatment effect and the standard error of the effect estimate are given. These are calculated by an analysis of covariance adjustment appropriate for a completely randomized experiment (Rubin 1981). This example was analysed using a hierarchical normal model in Rubin (1981) and Gelman, Carlin, Stern, and Rubin (2003, Section 5.5).
The corresponding TensorFlow Probability Jupyter notebook can be found here.
library(greta)
library(tidyverse)
library(bayesplot)

color_scheme_set("purple")
# data
N <- letters[1:8]
treatment_effects <- c(28.39, 7.94, -2.75, 6.82, -0.64, 0.63, 18.01, 12.16)
treatment_stddevs <- c(14.9, 10.2, 16.3, 11.0, 9.4, 11.4, 10.4, 17.6)
schools <- data.frame(N = N,
                      treatment_effects = treatment_effects,
                      treatment_stddevs = treatment_stddevs) %>%
  mutate(treatment_effects_p_stddevs = treatment_effects + treatment_stddevs,
         treatment_effects_m_stddevs = treatment_effects - treatment_stddevs)
For each of the eight schools (N) we have the estimated treatment effect (treatment_effects) and its standard error (treatment_stddevs).
Below, we replicate the bar plot from the TensorFlow Probability example, which shows the estimated treatment effect +/- standard error for each school:
ggplot(schools, aes(x = N, y = treatment_effects)) +
  geom_bar(stat = "identity", fill = "purple", alpha = 0.5) +
  geom_errorbar(aes(ymin = treatment_effects_m_stddevs,
                    ymax = treatment_effects_p_stddevs), width = 0.3) +
  labs(x = "school",
       y = "treatment effect",
       title = "Barplot of treatment effects for eight schools",
       subtitle = "Error bars represent standard error")
Another way to look at the estimated effects and their standard errors is to plot their density distributions across the eight schools:
schools %>%
  gather(x, y, treatment_effects, treatment_effects_p_stddevs, treatment_effects_m_stddevs) %>%
  ggplot(aes(x = y, color = x)) +
    geom_density(fill = "purple", alpha = 0.5) +
    scale_color_brewer(palette = "Set1") +
    labs(x = "treatment effect (+/- standard error)",
         color = "density curve of",
         title = "Density plot of treatment effects +/- standard error for eight schools")
greta
To model the data, we use the same hierarchical normal model as in the TensorFlow Probability example.
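Written out (a sketch in standard notation that matches the greta code below), with \(y_j\) the observed treatment_effects, \(\sigma_j\) the treatment_stddevs and \(j = 1, \dots, 8\) indexing the schools, the non-centred model is:

\[
\begin{aligned}
\mu &\sim \mathcal{N}(0, 10^2) && \text{(avg\_effect)} \\
\log \tau &\sim \mathcal{N}(5, 1^2) && \text{(avg\_stddev)} \\
\tilde\theta_j &\sim \mathcal{N}(0, 1^2) && \text{(school\_effects\_standard)} \\
\theta_j &= \mu + e^{\log\tau}\,\tilde\theta_j && \text{(school\_effects)} \\
y_j &\sim \mathcal{N}(\theta_j, \sigma_j^2) && \text{(treatment\_effects)}
\end{aligned}
\]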
First, we create greta arrays that represent the variables and prior distributions in our model, and combine them into a greta array for the school effects. We define the following (random) variables and priors:
avg_effect: a normal prior with a mean of 0 and a standard deviation of 10; it represents the prior average treatment effect.

avg_effect <- normal(mean = 0, sd = 10)
avg_effect
avg_stddev: a normal prior with a mean of 5 and a standard deviation of 1; it controls the amount of variance between schools (it is exponentiated below, so it acts on the log scale).

avg_stddev <- normal(5, 1)
avg_stddev
school_effects_standard: a standard normal prior (mean 0, standard deviation 1) with dimension 8, i.e. one element per school.

school_effects_standard <- normal(0, 1, dim = length(N))
school_effects_standard
school_effects: the school-level treatment effects; here we multiply the exponential of avg_stddev with school_effects_standard and add avg_effect (a non-centred parameterisation).

school_effects <- avg_effect + exp(avg_stddev) * school_effects_standard
school_effects
An alternative would be to use the lognormal() distribution directly for avg_stddev and use that to calculate school_effects:

avg_stddev <- lognormal(5, 1)
school_effects <- avg_effect + avg_stddev * school_effects_standard
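The two formulations are equivalent because the exponential of a Normal(5, 1) variable has a LogNormal(meanlog = 5, sdlog = 1) distribution. A minimal simulation sketch (not part of the original example) to convince yourself of this:

# compare exp(Normal(5, 1)) with LogNormal(meanlog = 5, sdlog = 1) by simulation
set.seed(42)
exp_normal_draws <- exp(rnorm(1e5, mean = 5, sd = 1))
lognormal_draws  <- rlnorm(1e5, meanlog = 5, sdlog = 1)

# the quantiles should be (nearly) identical
quantile(exp_normal_draws, c(0.1, 0.5, 0.9))
quantile(lognormal_draws,  c(0.1, 0.5, 0.9))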
Next, we link the variables and priors with the observed data, in this case the estimated school-level effects treatment_effects. We define the likelihood of the observed estimates treatment_effects as a normal distribution with mean school_effects and standard deviation treatment_stddevs, using the distribution() function:
distribution(treatment_effects) <- normal(school_effects, treatment_stddevs)
Now we have all the prerequisites for running Hamiltonian Monte Carlo (HMC) to estimate the posterior distribution over the model's parameters.
We first define the model by combining the avg_effect, avg_stddev and school_effects_standard greta arrays, so that we can sample from them during modelling. The model m we define below contains all of our prior distributions and thus represents the joint density of the model.
It is recommended that you check your model at this step by plotting the model graph. More information about these plots can be found here.
# defining the hierarchical model
m <- model(avg_effect, avg_stddev, school_effects_standard)
m
plot(m)
The actual sampling from the model happens with the mcmc() function. By default, 1000 MCMC samples are drawn after warm-up. What we obtain is a set of samples from the posterior distribution over the model variables.
# sampling
draws <- greta::mcmc(m, n_samples = 1000, warmup = 1000, chains = 4)
# numerical summary of the posterior draws
summary(draws)
# trace plots of the MCMC chains
mcmc_trace(draws, facet_args = list(ncol = 3))
# posterior intervals for each parameter
mcmc_intervals(draws)
# autocorrelation within the chains
mcmc_acf_bar(draws)
# histograms of the marginal posteriors
mcmc_hist(draws, facet_args = list(ncol = 3))
calculate() for transforming estimates to the natural scale

The calculate() function can be used with the same transformation that was used in building the model to obtain the school-specific posterior chains. This function is also how you would get posterior predictive values; a sketch follows after the next chunk.
# Calculate school effects on the original scale
school_effects <- avg_effect + exp(avg_stddev) * school_effects_standard
posterior_school_effects <- calculate(school_effects, values = draws)
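For posterior predictive values, a minimal sketch could look like the following. This is not part of the original post: treatment_effects_pred is a hypothetical name, and it assumes a greta version in which calculate() accepts an nsim argument for simulation (greta >= 0.3.1).

# hypothetical posterior predictive sketch (assumes calculate() supports nsim)
# define a greta array with the same sampling distribution as the observed data
treatment_effects_pred <- normal(school_effects, treatment_stddevs)

# simulate new treatment effect estimates, conditional on the posterior draws
pred_draws <- calculate(treatment_effects_pred, values = draws, nsim = 1000)

# pred_draws is a list holding an array of dimension 1000 x 8 x 1;
# compare its spread per school to the observed treatment_effects
apply(pred_draws[[1]], 2, quantile, probs = c(0.05, 0.5, 0.95))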
As a sanity check that we parameterized our model correctly, we can compare the back-transformed school-specific estimates to the results from the Edward2 approach in the TensorFlow Probability documentation. The results are very similar.
# Posterior means via Edward2
edward2_school_means <- data.frame(tool = "Edward2",
                                   school = N,
                                   # mean_school_effects_standard = c(0.61157268, 0.06430732, -0.25459746,
                                   #                                  0.04828103, -0.36940941, -0.23154463,
                                   #                                  0.49402338, 0.13042814),
                                   mean = c(14.93237686, 7.50939941, 3.07602358, 7.21652555,
                                            2.0329783, 3.41213799, 12.92509365, 8.36702347),
                                   sd = 0)

edward2_pop_mean <- data.frame(tool = "Edward2", 'mean' = 6.48866844177, 'sd' = 0)
# hmc_mean_avg_stddev <- 2.46163249016

posterior_school_effects <- as.data.frame(as.matrix(posterior_school_effects))

# Relabel school measures
colnames(posterior_school_effects) <- N

# Summarise and combine all chains of interest for plotting
posterior_summaries <- posterior_school_effects %>%
  gather(key = school, value = value) %>%
  group_by(school) %>%
  summarise_all(funs(mean, sd))

school_summaries <- posterior_summaries %>%
  mutate(tool = "greta") %>%
  rbind(edward2_school_means)

population_parameters <- as.data.frame(as.matrix(draws)) %>%
  select(avg_effect) %>%
  summarise_all(funs(mean, sd)) %>%
  mutate(tool = "greta") %>%
  rbind(edward2_pop_mean)

ggplot(school_summaries, aes(x = school, y = mean, color = tool, shape = tool)) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = 0.2) +
  geom_point() +
  geom_hline(data = population_parameters,
             aes(yintercept = mean, linetype = 'Population mean', color = tool)) +
  scale_linetype_manual(name = "", values = c(2, 2))