getmstatistic: Quantifying Systematic Heterogeneity in Meta-Analysis.


View source: R/driver_compute_mstatistics.R

Description

getmstatistic computes M statistics to assess the contribution of each participating study in a meta-analysis. The M statistic aggregates heterogeneity information across multiple variants to identify systematic heterogeneity patterns and their direction of effect in a meta-analysis. Its primary use is to identify outlier studies, which either show "null" effects or consistently show stronger or weaker genetic effects than average across the panel of variants examined in a GWAS meta-analysis.

Usage

getmstatistic(beta_in, lambda_se_in, study_names_in, variant_names_in, ...)

## Default S3 method:
getmstatistic(
  beta_in,
  lambda_se_in,
  study_names_in,
  variant_names_in,
  save_dir = getwd(),
  tau2_method = "DL",
  x_axis_increment_in = 0.02,
  x_axis_round_in = 2,
  produce_plots = TRUE,
  verbose_output = FALSE,
  ...
)

Arguments

beta_in

A numeric vector of study effect-sizes e.g. log odds-ratios.

lambda_se_in

A numeric vector of standard errors, genomically corrected at study-level.

study_names_in

A character vector of study names.

variant_names_in

A character vector of variant names e.g. rsIDs.

...

Further arguments.

save_dir

A character scalar specifying the path to the directory where plots should be stored. Optional; defaults to the current working directory (getwd()). Required if produce_plots = TRUE.

tau2_method

A character scalar, method to estimate heterogeneity: either "DL" or "REML" (Optional). Note: The REML method uses the iterative Fisher scoring algorithm (step length = 0.5, maximum iterations = 10000) to estimate tau2.

x_axis_increment_in

A numeric scalar, value by which x-axis of M scatterplot will be incremented (Optional).

x_axis_round_in

A numeric scalar, value to which x-axis labels of M scatterplot will be rounded (Optional).

produce_plots

A logical scalar; if TRUE (the default), plots are generated (optional).

verbose_output

A logical scalar; if TRUE, intermediate output is displayed (optional). Defaults to FALSE.
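The lambda_se_in argument expects standard errors that were already genomically corrected within each study. This page does not say how that correction was done, so the following is only a minimal sketch of the conventional genomic-control adjustment, under the assumption that one inflation factor is estimated per study from its genome-wide association statistics. The simulated data and all variable names are hypothetical.

```r
# Simulated summary statistics for ONE study (illustration only, not real data)
set.seed(1)
beta <- rnorm(1000, mean = 0, sd = 0.06)  # effect sizes, mildly inflated
se   <- rep(0.05, 1000)                   # uncorrected standard errors

# Genomic inflation factor: median observed chi-square statistic divided by
# the median of a chi-square distribution with 1 degree of freedom
chisq  <- (beta / se)^2
lambda <- median(chisq) / qchisq(0.5, df = 1)

# Inflate the standard errors by sqrt(lambda); leave them untouched if lambda <= 1
lambda_se <- se * sqrt(max(1, lambda))

# After correction the inflation factor is 1 (whenever lambda was above 1)
lambda_after <- median((beta / lambda_se)^2) / qchisq(0.5, df = 1)
```

Vectors corrected in this way, one study at a time, could then be stacked into the long format that getmstatistic takes (one entry per study-variant pair).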

Details

In contrast to conventional heterogeneity metrics (Q-statistic, I-squared and tau-squared), which measure random heterogeneity at individual variants, M measures systematic (non-random) heterogeneity across multiple independently associated variants.

Systematic heterogeneity can arise in a meta-analysis due to differences in the study characteristics of participating studies. Some of the differences may include: ancestry, allele frequencies, phenotype definition, age-of-disease onset, family-history, gender, linkage disequilibrium and quality control thresholds. See the getmstatistic website for statistical theory, documentation and examples.

getmstatistic uses summary data, i.e. study effect-sizes and their corresponding standard errors, to calculate M statistics (one M for each study in the meta-analysis).

In particular, getmstatistic employs the inverse-variance weighted random effects regression model provided in the metafor R package to extract SPREs (standardized predicted random effects) which are then aggregated to formulate M statistics.
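The aggregation described above can be illustrated with a small base-R sketch. It is not the package's implementation, which relies on metafor. It assumes a DerSimonian-Laird estimate of tau-squared, treats an SPRE as the internally standardized residual of an intercept-only random-effects model, and takes M to be the plain mean of a study's SPREs across variants. The toy data and the helper spre_one_variant are hypothetical.

```r
# Toy long-format data: 3 studies x 2 variants (made-up values)
toy <- data.frame(
  study   = rep(c("s1", "s2", "s3"), times = 2),
  variant = rep(c("rs_a", "rs_b"), each = 3),
  beta    = c(0.10, 0.30, 0.20, 0.05, 0.25, 0.15),
  se      = c(0.05, 0.05, 0.05, 0.04, 0.04, 0.04)
)

# Hypothetical helper: standardized residuals for a single variant
spre_one_variant <- function(beta, se) {
  v     <- se^2
  w     <- 1 / v                                   # fixed-effect weights
  k     <- length(beta)
  mu_fe <- sum(w * beta) / sum(w)
  q     <- sum(w * (beta - mu_fe)^2)               # Cochran's Q
  tau2  <- max(0, (q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))  # DL estimate
  w_re  <- 1 / (v + tau2)                          # random-effects weights
  mu_re <- sum(w_re * beta) / sum(w_re)
  # Residual divided by its standard error under the random-effects model
  (beta - mu_re) / sqrt(v + tau2 - 1 / sum(w_re))
}

# Compute SPREs variant by variant
toy$spre <- NA_real_
for (v in unique(toy$variant)) {
  rows <- toy$variant == v
  toy$spre[rows] <- spre_one_variant(toy$beta[rows], toy$se[rows])
}

# One aggregate per study: mean SPRE across variants
m_sketch <- tapply(toy$spre, toy$study, mean)
```

In this toy example s2 sits above the pooled effect at both variants and so gets a positive aggregate, s1 gets a negative one, and s3 lies at the average.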

Value

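Returns a list. The elements accessed in the Examples include:

M_dataset: dataset of M statistics.
influential_studies_0_05: dataset of stronger than average studies (significant at the 5% level).
weaker_studies_0_05: dataset of weaker than average studies (significant at the 5% level).
number_studies, number_variants: number of studies and variants.
M_expected_mean, M_expected_sd: expected mean and standard deviation of M.
M_crit_alpha_0_05: critical M value at the 5% significance level.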

Methods (by class)

See Also

The rma.uni function in metafor for the random effects model, and https://magosil86.github.io/getmstatistic/ for the getmstatistic website.

Examples

library(getmstatistic)
library(gridExtra)


# Basic M analysis using the heartgenes214 dataset.
# heartgenes214 is a multi-ethnic GWAS meta-analysis dataset for coronary artery disease.
# To learn more about the heartgenes214 dataset, run ?heartgenes214

# Running an M analysis on 20 GWAS significant variants (p < 5e-08) in the first 10 studies

heartgenes44_10studies <- subset(heartgenes214, studies <= 10 & fdr214_gwas46 == 2) 
heartgenes20_10studies <- subset(heartgenes44_10studies, 
    variants %in% unique(heartgenes44_10studies$variants)[1:20])

# Set directory to store plots, this can be a temporary directory
# or a path to a directory of choice e.g. plots_dir <- "~/Downloads"
plots_dir <- tempdir()

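# Arguments are passed by name so study and variant labels cannot be swapped
getmstatistic_results <- getmstatistic(beta_in = heartgenes20_10studies$beta_flipped, 
                                        lambda_se_in = heartgenes20_10studies$gcse, 
                                        study_names_in = heartgenes20_10studies$studies, 
                                        variant_names_in = heartgenes20_10studies$variants,
                                        save_dir = plots_dir)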
getmstatistic_results

# Explore results generated by getmstatistic function

# Retrieve dataset of M statistics
dframe <- getmstatistic_results$M_dataset



str(dframe)


# Retrieve dataset of stronger than average studies (significant at 5% level)
getmstatistic_results$influential_studies_0_05

# Retrieve dataset of weaker than average studies (significant at 5% level)
getmstatistic_results$weaker_studies_0_05

# Retrieve number of studies and variants
getmstatistic_results$number_studies
getmstatistic_results$number_variants

# Retrieve expected mean, sd and critical M value at 5% significance level
getmstatistic_results$M_expected_mean
getmstatistic_results$M_expected_sd
getmstatistic_results$M_crit_alpha_0_05

# To view plots stored in a temporary directory, call `tempdir()` to view the directory path 
tempdir()


# Additional examples: These take a little bit longer to run

## Not run: 

# Set directory to store plots, this can be a temporary directory
# or a path to a directory of choice e.g. plots_dir <- "~/Downloads"
plots_dir <- tempdir()

# Run M analysis on all 214 lead variants
# heartgenes214 is a multi-ethnic GWAS meta-analysis dataset for coronary artery disease.
getmstatistic_results <- getmstatistic(beta_in = heartgenes214$beta_flipped, 
                                        lambda_se_in = heartgenes214$gcse, 
                                        study_names_in = heartgenes214$studies, 
                                        variant_names_in = heartgenes214$variants,
                                        save_dir = plots_dir)
getmstatistic_results


# Subset the GWAS significant variants (p < 5e-08) in heartgenes214
heartgenes44 <- subset(heartgenes214, fdr214_gwas46 == 2)

# Exploring getmstatistic options:
#     Estimate heterogeneity using "REML", default is "DL"
#     Modify x-axis of M scatterplot
#     Run M analysis verbosely
getmstatistic_results <- getmstatistic(beta_in = heartgenes44$beta_flipped, 
                                        lambda_se_in = heartgenes44$gcse, 
                                        study_names_in = heartgenes44$studies, 
                                        variant_names_in = heartgenes44$variants,
                                        save_dir = plots_dir,
                                        tau2_method = "REML",
                                        x_axis_increment_in = 0.03, 
                                        x_axis_round_in = 3,
                                        produce_plots = TRUE,
                                        verbose_output = TRUE)
getmstatistic_results



## End(Not run)

magosil86/getmstatistic documentation built on May 10, 2021, 9:47 a.m.