```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(PaluckMetaSOP)
```

1. Introduction: the big picture

This vignette is about translating a study's results into meta-analyzable terms: converting its statistics into a standardized mean difference and variance.

The introduction places that procedure in the broader flow of a typical meta-analysis and explains why we need such a step. If you're already familiar with Cohen's d/Glass's ∆ and why you might need them, you can skip to the next section.

1.1 A meta-analysis is a six-step process.

1) Identify a causal question of the form: "what is the effect of X on Y?" where combining results from many papers will be useful.

    a) For instance, you might want to assess the scope conditions under which an effect can be expected; identify cross-study moderators, i.e., population- or setting-level factors that can be expected to strengthen or weaken effects; or put multiple theoretical approaches head to head to see which one works best.

    b) The alternative to a meta-analysis for assessing the relationship between X and Y is to run an experiment (or otherwise well-identified study) on that relationship; but a meta-analysis can also be very helpful in figuring out what that experiment should look like.

2) Decide which papers shed light on that question.

    a) In practice this means operationalizing X and Y, which might not be straightforward: does imagined contact count as contact? What exactly is prejudice, and how do we measure it?

    b) It also means setting inclusion criteria related to, e.g., internal validity, measurement error, or population. For instance, your meta-analysis might be focused only on the results of [policy-relevant randomized controlled trials](https://www.cambridge.org/core/journals/behavioural-public-policy/article/contact-hypothesis-reevaluated/142C913E7FA9E121277B29E994124EC5), a [particular class of dependent variable](https://osf.io/a7g95/), or the effects of [sexual violence prevention programs on college campuses](https://www.jahonline.org/article/S1054-139X(23)00117-9/abstract).

3) Gather and read all such papers (i.e., conduct a systematic search).

4) Condense each study's results to a single point estimate, or a cluster of point estimates, and associated variance(s).

5) Aggregate those estimates into a pooled average effect and any relevant subgroup effects.

6) Write up your results.

1.2 The problem of heterogeneous measurement strategies

You might be wondering: why is a translation step necessary at all? Why can't you just record the results that the papers present? In some contexts you might be able to, particularly in medicine. If you're studying, say, the effects of bed nets on malaria, it's possible that all studies in the sample will present key results on the exact same scales, and in ways that any reader can make sense of: malaria cases, severe malaria cases, deaths, etc. In that situation, it probably does make sense to just record exactly what the authors do, and then meta-analyze that.

This is not the case in the Paluck Lab's meta-analyses. Our outcomes, such as prejudice, don't typically have a universally agreed-upon measurement strategy. But we still need to integrate results from many papers to come up with a pooled average effect. Moreover, statistical results come in many varieties, and we also need a way to put, say, a t-test, a difference in means, and an odds ratio on a common scale.

1.5 The next piece of the puzzle: variance (and standard error)

Once all your studies are expressed as ∆, i.e. the number of standard deviations of change on the dependent variable, it's straightforward to compare and combine them. The largest number is, on paper, the most effective treatment, and the average of two such estimates should, if the studies are equally well powered, be a better estimate of the true population effect than either study on its own. However, that caveat about statistical power is important. If you have a study with 50 subjects that finds an effect size of ∆ = 0.5, and another study with 500 subjects that finds ∆ = 0.1, intuitively you want to give more weight to the larger study because, by the law of large numbers, the sample mean converges towards the population mean as sample size increases.

The variance and standard error of your ∆ estimate help you get there. For each ∆, we calculate:

$$ var_d = \left(\frac{n_t + n_c}{n_t \cdot n_c}\right) + \left(\frac{d^2}{2 \cdot (n_t + n_c)}\right) $$

where var_d is the variance of ∆ (written as 'd' in equations and functions); n_t is the sample size of the treatment group; n_c is the sample size of the control group; and d is the effect size.

We then apply what's called the Hedges' g correction factor, which is intended to correct for the bias that d exhibits in small samples; its practical effect is concentrated in small studies.

The correction is as follows (taken from the var_d_calc.R function):

$$ g = 1 - \left(\frac{3}{4 \cdot (n_t + n_c - 2) - 1}\right) $$

The variance of ∆ comes out to $var_d \cdot g^2$. The standard error of ∆ is the square root of that variance. By convention, we include both columns in our meta-analytic datasets. Theoretically one or the other would suffice, but some functions ask for var_d while others ask for se_d, so recording both saves a conversion step later.
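To make the arithmetic concrete, here is a minimal sketch of the whole calculation, assuming independent treatment and control groups. It is an illustrative re-implementation, not the package's var_d_calc() itself, and the function name is ours:

```r
# Illustrative sketch of the variance calculation for d/Delta;
# see var_d_calc.R in the package for the canonical version.
var_d_sketch <- function(d, n_t, n_c) {
  # Uncorrected variance of d
  uncorrected <- ((n_t + n_c) / (n_t * n_c)) + (d^2 / (2 * (n_t + n_c)))
  # Hedges' g small-sample correction factor
  g <- 1 - (3 / (4 * (n_t + n_c - 2) - 1))
  # Corrected variance and its square root, the standard error
  c(uncorrected = uncorrected,
    var_d = uncorrected * g^2,
    se_d  = sqrt(uncorrected * g^2))
}
```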

In practice, the Hedges' g correction meaningfully changes the stated variance of a small study, e.g. one with 25 subjects per arm, and does not meaningfully alter the variance of a larger study, e.g. one with 250 subjects per arm. You can build an intuition for this by altering var_d_calc to omit the Hedges' g correction (or just comment it out and instead set g to 1) and plugging in different values for d, n_t, and n_c.
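For instance, running the sketch above on those two hypothetical studies (the uncorrected column is what you would get by setting g to 1):

```r
# Small study: the correction shifts the variance by roughly 3%
var_d_sketch(d = 0.5, n_t = 25, n_c = 25)
#> roughly: uncorrected = 0.0825, var_d = 0.0799, se_d = 0.283

# Larger study: the correction shifts the variance by roughly 0.3%
var_d_sketch(d = 0.5, n_t = 250, n_c = 250)
#> roughly: uncorrected = 0.00825, var_d = 0.00823, se_d = 0.0907
```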

1.6 ∆ and variance of ∆ are necessary for meta-analysis.

Step 5 – combining and comparing studies – requires a standardized mean difference estimate and its associated variance. Any pooled estimate makes use of both pieces of information, as do most of the graphs you might want to share with your readers and many of your potential checks for publication bias.
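As a toy illustration of why both pieces are needed, here is the inverse-variance weighting idea that underlies most pooled estimates. This is a sketch of the principle, not this package's pooling code; in practice you would use a dedicated meta-analysis routine:

```r
# Two studies from the earlier example: d = 0.5 (n = 50) and d = 0.1 (n = 500)
d     <- c(0.5, 0.1)
var_d <- c(0.0825, 0.00825)   # their variances (uncorrected, for simplicity)
w     <- 1 / var_d            # inverse-variance weights

# Fixed-effect pooled estimate and its standard error
d_pooled  <- sum(w * d) / sum(w)
se_pooled <- sqrt(1 / sum(w))
c(d_pooled = d_pooled, se_pooled = se_pooled)
#> roughly: d_pooled = 0.136, se_pooled = 0.087
```

Note that the pooled estimate lands much closer to the larger study's ∆ = 0.1, which is the weighting intuition from section 1.5 in action.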

Your meta-analytic dataset will (ideally) have one column for d/∆, one for variance, and one for standard error. Most meta-analytic functions, including those in this package, will expect your dataset to be structured this way.
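A minimal sketch of that structure (the study names and values here are made up, and exact column names may vary; check each function's documentation for the names it expects):

```r
# Toy meta-analytic dataset with the three conventional columns
dat <- data.frame(
  study = c("Study A", "Study B"),
  d     = c(0.5, 0.1),
  var_d = c(0.0799, 0.0082)
)
dat$se_d <- sqrt(dat$var_d)   # standard error is the square root of variance
dat
```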


