knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(eefAnalytics2020)
library(lme4)
library(rstanarm)
knitr::opts_chunk$set(echo = TRUE)
The aim of an educational trial is to determine whether a particular educational intervention is effective. Several researchers have recommended reporting effect sizes and their confidence intervals to convey the effectiveness of interventions in educational trials. Hedges [-@hedges2007effect] provided a way of calculating the variability of the effect size for simple and cluster randomised trials. Building on Hedges' work, [@akansha2020] derived a formula for the variability of Hedges' g effect size in multisite studies. The eefAnalytics package provides separate functions for trials with different study designs. These functions use different statistical models according to the study design: the effect size is calculated from the variance and the intervention coefficient of the fitted model, and the variance of the effect size is estimated accordingly. There are srtFREQ and srtBayes functions for simple randomised trials, crtFREQ for cluster randomised trials and mstFREQ for multisite trials.
In education, some researchers prefer to report effect sizes and confidence intervals based on the Bayesian framework, which automatically generates the variance of the effect size. This package provides an additional set of functions for them; users can easily extract a summary of the parameter estimates, the effect size, its confidence interval and the posterior probability for the effect size. Other Bayesian packages allow users to specify the model arguments as in lme4, but the results must then be extracted and manipulated to calculate the effect size and other relevant information. In the eefAnalytics package, all such estimates can be extracted directly using our functions. The Bayesian functions are organised by study design: srtBayes for simple randomised trials, crtBayes for cluster randomised trials and mstBayes for multisite trials. To get a better understanding of an educational trial, [@germaine2020] proposed the use of posterior probability as an alternative and more intuitive way of reporting the evidence for an intervention. All Bayesian functions can provide the posterior probability. Whether a Bayesian or a frequentist approach is used, the eefAnalytics package offers a flexible way to obtain all the information needed to inform policy-makers about the effectiveness of an intervention. Note that this package can only analyse a continuous dependent variable, while covariates can be continuous or measured on another scale.
The purpose of this document is to give a basic description of the functionality of the eefAnalytics package and of the code used to run it. The package enables users to easily obtain parameter and variance estimates, the effect size with its confidence or credible interval, and posterior probabilities from randomised controlled trials in education. Please note that the package is also appropriate for RCTs in other social science disciplines, as long as the post-intervention outcome is continuous.
The srtFREQ function analyses data from simple randomised controlled trials using linear regression. It is appropriate when the data are not clustered and the sample sizes per school are equal. The functional form of the model is straightforward, as shown in the equation below:
$$ y_{i}= \beta_0 + \beta_1 t_{i} + \beta_2 pret_{i} + ...+ \epsilon_{i} $$
where $i$ indexes pupils and $\epsilon_{i} \sim N(0, \sigma^2)$. Here $y_{i}$ is the post-test result of pupil $i$, $t_{i}$ is the intervention indicator (0 for control, 1 for intervention in a two-arm trial), and $pret_{i}$ is the pre-test score. $\sigma^2$ and $\beta_1$ are used to estimate the effect size, Cohen's d [@cohen2013statistical], for the OLS model:
$$ d = \frac{\beta_1}{\sqrt{\sigma^2}} $$
Hedges' g is then obtained from Cohen's d using the correction factor
$$ J(df)=1-\frac{3}{4df-1} $$
where $df$ is the degrees of freedom, so that $g = J(df) \times d$.
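The conversion from Cohen's d to Hedges' g can be sketched in base R as follows. This is only an illustration on simulated data: the variable names, the simulated values and the use of the residual degrees of freedom for $df$ are assumptions, and the code is not taken from srtFREQ itself.

```r
# Minimal sketch: Cohen's d and Hedges' g from an OLS fit on simulated data.
# All names and values below are illustrative assumptions.
set.seed(1)
n <- 200
sim <- data.frame(
  t    = rep(c(0, 1), each = n / 2),  # intervention indicator (0 = control)
  pret = rnorm(n)                     # pre-test score
)
sim$y <- 0.3 * sim$t + 0.5 * sim$pret + rnorm(n)  # post-test score

fit    <- lm(y ~ t + pret, data = sim)
beta1  <- unname(coef(fit)["t"])      # intervention coefficient
sigma2 <- summary(fit)$sigma^2        # residual variance
df     <- df.residual(fit)            # assumed choice of df (residual df)

d <- beta1 / sqrt(sigma2)             # Cohen's d
J <- 1 - 3 / (4 * df - 1)             # correction factor J(df)
g <- J * d                            # Hedges' g

round(c(d = d, J = J, g = g), 3)
```

Because $J(df)$ is slightly below 1, g is always a little smaller in magnitude than d, and the two converge as the sample size grows.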
srtFREQ can be called using the command below.
output1 <- srtFREQ(Posttest~Intervention+Prettest,intervention="Intervention",data=mstData)
The output from this function, and from the other functions described later, provides estimates of the parameters of the statistical model, including the effect size and variance parameters.
output1
The package lets you extract both conditional and unconditional effect sizes. The conditional effect size uses the variance from the fitted model, while the unconditional effect size uses the variance from the empty model with no covariates.
output1$ES
output1$Unconditional$ES
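The difference between the two effect sizes can be illustrated with a small base R sketch. It assumes that both versions share the same intervention coefficient and differ only in the residual standard deviation used as denominator; that reading, the simulated data and all object names are assumptions rather than the package's internal code.

```r
# Minimal sketch: conditional vs unconditional effect size (simulated data).
set.seed(2)
n <- 400
sim <- data.frame(t = rep(c(0, 1), each = n / 2), pret = rnorm(n))
sim$y <- 0.25 * sim$t + 0.7 * sim$pret + rnorm(n)

full  <- lm(y ~ t + pret, data = sim)  # model with covariates
empty <- lm(y ~ 1, data = sim)         # empty model, no covariates

beta1 <- unname(coef(full)["t"])       # intervention coefficient

es_conditional   <- beta1 / summary(full)$sigma   # variance from fitted model
es_unconditional <- beta1 / summary(empty)$sigma  # variance from empty model

round(c(conditional = es_conditional, unconditional = es_unconditional), 3)
```

When the covariates explain part of the outcome variance, the empty model has the larger residual variance, so the unconditional effect size is the smaller of the two in magnitude.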
The crtFREQ function analyses cluster randomised trial data using the multilevel model specified below.
$$ y_{ij}= \beta_0 + \beta_1 t_{ij} + \dots + b_{1j} + \epsilon_{ij} $$
where the continuous outcome $y_{ij}$ is the post-test result of pupil $i$ in school $j$, with $j = 1, 2, \dots, M$ and $i = 1, 2, \dots, n_j$. $M$ is the number of schools and $n_j$ is the number of pupils in school $j$. $t_{ij}$ is the intervention variable for pupil $i$ in school $j$. $\epsilon_{ij} \sim N(0,\sigma^2)$ is the residual error, whose variance $\sigma^2$ captures individual pupil differences in post-test results around their school means, and $b_{1j} \sim N(0,\sigma^2_0)$ is the school-level random intercept, whose variance $\sigma^2_0$ captures the variation between schools.
The effect size is calculated using the intervention coefficient $\beta_1$ in the numerator and the within or total variance in the denominator. The total variance is the sum of the within ($\sigma^2$) and between ($\sigma^2_0$) variances. Hedges' g effect size and its confidence interval are then estimated using the formula provided by Hedges [-@hedges2007effect]. The output of crtFREQ is obtained as follows:
output2 <- crtFREQ(Posttest~Intervention+Prettest,random="School",intervention="Intervention",data=crtData)
As with srtFREQ, the output from crtFREQ provides estimates of all the parameters in the multilevel model above, including the beta coefficients, the variances and the conditional and unconditional effect sizes. Because the function uses a multilevel model, the output also contains the random effect component.
output2$ES
output2$Unconditional$ES
In addition, SchEffects contains the random intercepts for the clusters, e.g. schools.
output2$SchEffects
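The within and total effect sizes described for crtFREQ can be sketched with a small helper. The function name `crt_es` and the numbers passed to it are hypothetical, and the sketch returns uncorrected effect sizes only; it omits the Hedges correction and the confidence intervals that crtFREQ reports.

```r
# Minimal sketch: within and total effect sizes for a cluster randomised trial.
# crt_es() is a hypothetical helper, not a function of the package.
crt_es <- function(beta1, sigma2_within, sigma2_between) {
  sigma2_total <- sigma2_within + sigma2_between  # total = within + between
  c(within = beta1 / sqrt(sigma2_within),
    total  = beta1 / sqrt(sigma2_total))
}

# Made-up estimates, for illustration only
es <- crt_es(beta1 = 0.4, sigma2_within = 0.8, sigma2_between = 0.2)
round(es, 3)
```

The total effect size can never exceed the within effect size in magnitude, since the between-school variance only adds to the denominator.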
The mstFREQ function fits a model with optional predictors and an intervention-by-school interaction in the random part. The model depends on the number of arms of the intervention. For a two-arm intervention, the model is written as follows.
$$ y_{ij}= \beta_0 + \beta_1 t_{ij} + \dots + b_{1i} + b_{2i} t_{ij} + \epsilon_{ij} $$
where $i$ indexes clusters and $j$ indexes pupils, $\epsilon_{ij} \sim N(0, \sigma^2)$, and
$$ \left(\begin{array}{c} b_{1i} \\ b_{2i} \end{array}\right) \sim N \left( \left(\begin{array}{c} 0 \\ 0 \end{array}\right), \left(\begin{array}{cc} \sigma^2_0 & \sigma_{01} \\ \sigma_{10} & \sigma^2_1 \end{array}\right) \right) $$
$$\sigma^2_T = \frac{N(\sigma^2+ \sigma_{0}^2)+N_{T1}(\sigma^2_1+2\sigma_{01})}{N} $$
The effect size is estimated using $\sigma^2_T$ and $\beta_1$. The variance of the effect size is estimated using the recently proposed method of [@akansha2020], which mstFREQ uses to obtain the variance of the effect size and the variance parameters.
output3 <- mstFREQ(Posttest~Intervention+Prettest,random="School",intervention="Intervention",data=mstData)
output3
The summary of output3 shows that the effect size and the variance and covariance parameters can be extracted. For example, the conditional effect size is obtained with:
output3$ES
SchEffects can be used to extract the random intercepts and slope coefficients for the clusters (schools).
output3$SchEffects
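The formula for $\sigma^2_T$ in the multisite model can be evaluated directly, as in the sketch below. The helper `mst_total_variance` and every number passed to it are hypothetical. The source does not define $N$ and $N_{T1}$; reading them as the total number of pupils and the number of pupils in the intervention arm is an assumption made only for this example.

```r
# Minimal sketch: total variance and effect size for a two-arm multisite trial.
# mst_total_variance() is a hypothetical helper, not a function of the package.
mst_total_variance <- function(N, N_T1, sigma2, sigma2_0, sigma2_1, sigma_01) {
  (N * (sigma2 + sigma2_0) + N_T1 * (sigma2_1 + 2 * sigma_01)) / N
}

# Made-up variance components, for illustration only
sigma2_T <- mst_total_variance(
  N = 200, N_T1 = 100,        # assumed: all pupils, pupils in intervention arm
  sigma2 = 0.80,              # residual (within-school) variance
  sigma2_0 = 0.15,            # random intercept variance
  sigma2_1 = 0.05,            # random slope variance
  sigma_01 = 0.01             # intercept-slope covariance
)

beta1 <- 0.3                  # made-up intervention coefficient
es    <- beta1 / sqrt(sigma2_T)
round(c(sigma2_T = sigma2_T, ES = es), 4)
```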
The srtFREQ, crtFREQ and mstFREQ functions also accept the nPerm and nBoot arguments. nPerm specifies the number of permutations used to generate a permutation p-value. nBoot specifies the number of bootstrap samples used to generate bootstrap confidence intervals. Please note that you can use either nPerm or nBoot, but not both at the same time.
A few code examples and their results are given below.
outputb <- srtFREQ(Posttest~Intervention+Prettest,intervention="Intervention", nBoot=1000, data=mstData)
outputb$Bootstrap
outputp <- crtFREQ(Posttest~Intervention+Prettest,random="School",intervention="Intervention", nPerm=1000, data=crtData)
outputp$Perm
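The ideas behind nPerm and nBoot can be sketched in base R for a simple randomised trial. The two-sided permutation p-value, the percentile bootstrap interval, the simulated data and all object names are assumptions chosen for illustration; the package's own resampling scheme may differ, for example in how it treats schools in clustered designs.

```r
# Minimal sketch: permutation p-value and bootstrap CI for an effect size.
set.seed(3)
n <- 120
sim <- data.frame(t = rep(c(0, 1), each = n / 2), pret = rnorm(n))
sim$y <- 0.4 * sim$t + 0.5 * sim$pret + rnorm(n)

# Effect size: intervention coefficient over residual standard deviation
es <- function(dat) {
  fit <- lm(y ~ t + pret, data = dat)
  unname(coef(fit)["t"]) / summary(fit)$sigma
}
observed <- es(sim)

# Permutation: shuffle the intervention labels to mimic "no effect"
n_perm  <- 500
perm_es <- replicate(n_perm, {
  shuffled   <- sim
  shuffled$t <- sample(shuffled$t)
  es(shuffled)
})
p_perm <- mean(abs(perm_es) >= abs(observed))  # two-sided p-value

# Bootstrap: resample pupils with replacement, take percentile limits
n_boot  <- 500
boot_es <- replicate(n_boot, es(sim[sample(nrow(sim), replace = TRUE), ]))
ci_boot <- quantile(boot_es, c(0.025, 0.975))

observed
p_perm
ci_boot
```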
srtBayes is a new function that performs OLS (linear regression) analysis in a Bayesian framework using Stan. Because it uses vague priors, its results will mostly be similar to those of srtFREQ. It fits the model with the stan_glm function of the rstanarm R package. To ensure reproducibility of the results, it uses seed = 1234.
Let us explore how srtBayes works by analysing the data provided in the package. As the treatment variable is one of the main arguments of this function, it is essential to check that the function can accommodate a treatment variable with two or more arms.
As in Stan, this function shows the number of iterations (nsim) used for each of the four chains. As shown below, it also produces results with a warning if more iterations are needed, in other words, if the model has not converged.
thd=c(0.1,0.2)
output4 <-srtBayes(Posttest~Intervention+Prettest,intervention="Intervention",adaptD=NULL,nsim=20,data=mstData,threshold=thd)
The following code shows the analysis and results for a two-arm intervention, using a model with and without the school variable in the set of covariates.
a) A model without the school predictor; the covariates are pretest and treatment.
```r
nsim=2000
output4 <- srtBayes(Posttest~Intervention+Prettest,intervention="Intervention",adaptD=NULL,nsim=nsim,data=mstData,threshold=thd)
names(output4)
```
```r
#conditional effect size
output4$ES
```
The following code shows the unconditional output stored in output4.
```r
names(output4$unconditional)
```
```r
#unconditional effect size
output4$unconditional$ES
```
b) A model with all the covariates in output4 plus the school predictor.
# Two treatment arms: with school predictor
nsim=2000
output5 <-srtBayes(Posttest~Intervention+Prettest+ factor(School),intervention="Intervention", adaptD=NULL,nsim=nsim,data=mstData,threshold=thd)
The conditional and unconditional effect sizes can be extracted in the same way as for the previous three functions.
# Conditional ES
output5$ES
# Unconditional ES
output5$unconditional$ES
The crtBayes and mstBayes functions run the Bayesian analysis using Stan with seed=1234. They require the following arguments: formula, data, intervention, random, threshold, number of iterations (nsim), and adapt delta (adaptD). They can produce both conditional and unconditional effect sizes, as well as between and total effect sizes. If necessary, further arguments of the stan_lmer function can also be used.
With the Bayesian approach, each parameter of the model is assigned a prior. The Bayesian MLM, which combines all the parameters with their priors, can be summarised in three levels:
$$\text{Level 1:} \quad y_{i,j} \mid \beta, b_i, \sigma^2 \sim N(X^T_{i,j} \beta + Z^T_{i,j} b_i, \sigma^2) \quad \text{for} \quad j=1,\dots,m_i; \; i=1,\dots,N$$
$$\text{Level 2:} \quad b_i \mid G \sim N(0,G) \quad \text{for} \quad i=1,\dots,N$$
$$\text{Level 3:} \quad \sigma^2 \sim p(\sigma^2), \quad \beta \sim p(\beta), \quad \text{and} \quad G \sim p(G)$$
The joint posterior distribution for the Bayesian MLM is then given by:
$$P(\beta,G,\sigma^2,b_1,\dots,b_N \mid y_1,\dots,y_N) \propto \prod_{i=1}^{N} \prod_{j=1}^{m_i} P(y_{i,j} \mid b_i,\sigma^2, \beta) \prod_{i=1}^{N} P(b_i \mid G)\,P(\beta)\,P(G)\,P(\sigma^2) $$
The ES estimate and its credible interval are obtained directly from the posterior distribution. The ES calculated at each iteration is given by:
$$ ES= P \left(\frac{\beta_2}{\sqrt{{\sigma_T}^2}} \mid data \right)$$
where $\sigma_T^2$ is estimated in the same way as in the crtFREQ and mstFREQ functions. In addition to the effect size and the variance and covariance parameters, crtBayes and mstBayes also provide posterior probability estimates for a defined threshold.
Posterior probabilities can be estimated using
$$P(\omega_k\mid \sigma^2,G,\beta,b ,y )= \frac {\sum_{i=1}^M I(ES^{(i)}>\phi_k)}{M} \text{ for } k=0.0, 0.1, 0.2, \dots,1.0 $$
where $M$ is the number of iterations used, $ES^{(i)}$ is the effect size at iteration $i$ and $\phi_k$ is the threshold.
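The posterior probability above is simply the share of posterior draws of the effect size that exceed a threshold. The sketch below illustrates this in base R with simulated stand-in draws; in practice the draws would come from the fitted Stan model. The distributions, the object names and the 95% equal-tailed interval are assumptions made for illustration.

```r
# Minimal sketch: effect size per iteration and posterior probabilities.
# The draws below are simulated stand-ins, not output from crtBayes/mstBayes.
set.seed(4)
M <- 4000                                               # number of draws
beta_draws    <- rnorm(M, mean = 0.25, sd = 0.08)       # intervention coefficient
sigma2T_draws <- 1 / rgamma(M, shape = 50, rate = 50)   # positive total variances

es_draws <- beta_draws / sqrt(sigma2T_draws)            # ES at each iteration

thresholds <- c(0.1, 0.2)
prob_es <- sapply(thresholds, function(phi) mean(es_draws > phi))
names(prob_es) <- paste0("P(ES > ", thresholds, ")")

ci <- quantile(es_draws, c(0.025, 0.975))               # 95% credible interval

prob_es
ci
```

By construction the probability can only decrease as the threshold increases, which is a useful sanity check on any reported set of posterior probabilities.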
In this part, we explore the crtBayes function using the data provided in the package. As shown in the srtBayes part above, crtBayes also displays the number of iterations (nsim) used for each of the four chains and any warnings (see the srtBayes example above).
To test the function on models with a pretest and more than one treatment arm, we present results from models with two arms and three arms. Intervention is a two-arm intervention variable and Intervention2 is a three-arm intervention variable. The code and results are presented below.
```r
thd=c(0.1,0.2)
nsim <- 2000
output6 <- crtBayes(Posttest~Prettest+Intervention,intervention="Intervention", random="School",adaptD=NULL,nsim=nsim,data=crtData,threshold=thd)
```
a) Conditional effect size from a model with one treatment group and one control group.
```r
output6$ES
```
The following code shows all the unconditional output that crtBayes can provide (including the unconditional effect size).
```r
names(output6$unconditional)
```
```r
#unconditional effect size
output6$unconditional$ES
```
b) Conditional effect size from a model with two treatment groups and one control group.
thd=0.1
nsim <- 2000
output7 <- crtBayes(Posttest~Prettest+Intervention2,intervention="Intervention2", random="School",adaptD=NULL,nsim=nsim,data=crtData,threshold=thd)
output7$ES
The following code shows all the unconditional output that crtBayes can provide.
```r
output7$unconditional
```
```r
#unconditional effect size
output7$unconditional$ES
```
Posterior probability for a specific threshold can be extracted using the command below.
#Posterior probability
output7$ProbES
In this part, we explore the mstBayes function using the data from the srtBayes example. We use the threshold thd=c(0.1,0.2) and the same dataset, which contains the variables Intervention (two arms) and Intervention2 (three arms). As shown in the srtBayes part above, this function displays the number of iterations (nsim) used for each of the four chains and any warnings (see the srtBayes example above).
To test the function on models with a pretest and more than one treatment arm, we also fit a model that uses Intervention2 as the intervention variable. The code and results for the two-arm and three-arm trials are presented below.
```r
thd=c(0.1,0.2)
nsim <- 2000
output8 <- mstBayes(Posttest~Prettest+Intervention,intervention="Intervention", random="School",adaptD=NULL,nsim=nsim,data=mstData,threshold=thd)
```
a) Conditional effect size from the output8 model.
```r
output8$ES
```
The following code shows all the unconditional output that mstBayes can provide (including the unconditional effect size).
```r
output8$unconditional
```
```r
#Unconditional effect size
output8$unconditional$ES
```
b) A model with pretest and the three-arm intervention as covariates.
output9 <-mstBayes(Posttest~Prettest+Intervention2,intervention="Intervention2", random="School", adaptD=NULL,nsim=nsim,data=mstData,threshold=thd)
#conditional effect size
output9$ES
#unconditional effect size
output9$unconditional$ES
The eefAnalytics package enables R users to easily obtain Hedges' g effect sizes and their confidence or credible intervals, with options for different study designs using frequentist and Bayesian modelling approaches. It analyses simple randomised trials through a simple linear regression model, and cluster randomised and multisite trials via suitable multilevel models. The main contribution of this software package is the way it models the variability around the effect size under the frequentist approach and facilitates obtaining Bayesian, permutation and bootstrap results. This article provides an overview of the modelling process underlying eefAnalytics and demonstrates its application with examples.