hJAM Vignette

library(hJAM)

Overview

hJAM is an R package that implements the hJAM model, which estimates the associations between multiple intermediates and an outcome in a Mendelian randomization or transcriptome-wide association analysis.

Mendelian randomization (MR) and transcriptome-wide association studies (TWAS) can be viewed as the same approach within the instrumental variable framework, with genetic variants as the instruments. They differ in their intermediates: MR focuses on modifiable risk factors, while TWAS focuses on gene expression. A two-stage hierarchical model unifies the MR and TWAS frameworks. Details are described in our paper.
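
As a rough sketch of the two-stage idea, the generic instrumental variable setup can be written as below. The notation is ours and only illustrative: $G$ is the genotype matrix, $X$ the intermediates, $y$ the outcome, $A$ the SNP-intermediate effects and $\pi$ the intermediate-outcome effects. The paper gives the exact hJAM formulation.

$$
\begin{aligned}
X &= G A + E_X, \\
y &= X \pi + \epsilon_y \\
  &= G (A \pi) + \epsilon, \qquad \epsilon = E_X \pi + \epsilon_y .
\end{aligned}
$$

Under this sketch the SNP-outcome effects equal $A\pi$, so $\pi$ can be estimated by regressing SNP-outcome estimates on $\hat{A}$.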

Implementation

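We have two methods in our package: hJAM_lnreg, which restricts the intercept to zero, and hJAM_egger, which allows a non-zero intercept. Both are demonstrated in the Example section.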

General input

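The two hJAM models share the same inputs, which appear as arguments in the Example section: betas.Gy (the association estimates between the SNPs and the outcome), N.Gy (the sample size of the GWAS that supplied betas.Gy) and A (the conditional $\hat{A}$ matrix), together with Gl and ridgeTerm.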

Get conditional $\hat{A}$ matrix

To convert a marginal $\hat{A}$ matrix into a conditional $\hat{A}$ matrix, use get_cond_A (when there is more than one intermediate) or get_cond_alpha (when there is a single intermediate) from the hJAM package. Examples are given in the Example section below.
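
The idea behind a marginal-to-conditional conversion can be illustrated in base R. This is a minimal sketch on simulated individual-level data, not the code inside get_cond_A: the sample size, SNP effects and LD pattern are hypothetical, and the same genotype matrix is used to compute both the marginal estimates and the SNP correlation structure.

```r
set.seed(1)
n <- 500                                            # hypothetical sample size
G <- matrix(rbinom(n * 3, 2, 0.3), n, 3)            # three SNPs coded 0/1/2
G[, 2] <- ifelse(runif(n) < 0.7, G[, 1], G[, 2])    # induce LD between SNP 1 and SNP 2
G <- scale(G, center = TRUE, scale = FALSE)         # centre the genotypes
x <- drop(G %*% c(0.5, 0, 0.3)) + rnorm(n)          # intermediate; SNP 2 has no direct effect

# Marginal estimates: one single-SNP regression per SNP
marginal <- drop(crossprod(G, x)) / colSums(G^2)

# Conditional estimates: solve (G'G) a_cond = diag(G'G) * a_marg
GtG <- crossprod(G)
conditional <- drop(solve(GtG, diag(GtG) * marginal))

# The joint regression of x on all SNPs gives the same numbers
joint <- unname(coef(lm(x ~ 0 + G)))
rbind(marginal, conditional, joint)
```

In this simulation the marginal estimate for SNP 2 picks up signal from SNP 1 through LD, while the conditional estimate does not.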

For MR questions, the intermediates are modifiable risk factors. The marginal $\hat{A}$ can be extracted from GWAS whose outcomes are the risk factors of interest. For example, when the intermediate is body mass index, the marginal $\hat{\alpha}$ vector can be extracted from the GIANT consortium.

For TWAS questions, the intermediates are gene expression levels. There are two ways to obtain the elements of the $\hat{A}$ matrix.

  1. GTEx portal: the GTEx project provides marginal summary statistics for the associations between SNPs and gene expression in different tissues.

  2. PredictDB: PredictDB is developed by the PrediXcan group. It applies elastic net to individual-level data from the GTEx project.


Implementation with caution:


Example

In our package, we prepared a data example which we have described in detail in our paper. In this data example, we focus on the conditional effects of body mass index (BMI) and type 2 diabetes (T2D) on myocardial infarction (MI).

We identified 75 SNPs significantly associated with BMI from the GIANT consortium and 136 SNPs significantly associated with T2D from DIAGRAM+GERA+UKB. One SNP appeared in both instrument sets, giving 210 unique SNPs in total. The association estimates between the 210 SNPs and MI were collected from UK Biobank.
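
The SNP count can be checked with a few lines of R. The identifiers below are hypothetical placeholders; only the set sizes and the single overlap follow the text.

```r
# Hypothetical SNP identifiers; only the set sizes follow the text.
bmi_snps <- paste0("snp", 1:75)          # 75 BMI instruments
t2d_snps <- paste0("snp", 75:210)        # 136 T2D instruments, sharing "snp75"

length(intersect(bmi_snps, t2d_snps))    # 1 SNP in both instrument sets
all_snps <- union(bmi_snps, t2d_snps)
length(all_snps)                         # 75 + 136 - 1 = 210
```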

Data exploration

A quick look at the data in the example:

data("conditional_A")
data("betas.Gy")
data("SNPs_info")
data("Gl") # Gl is passed to the plotting, conversion and model functions
conditional_A[1:10, ]
betas.Gy[1:10]
SNPs_info[1:10, ]
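
Before fitting, it can help to confirm that the inputs line up. The helper below is a hypothetical sketch and not part of hJAM; it assumes that A has one row per SNP and that Gl has one column per SNP, and it is shown on toy data so that it runs without the package.

```r
# Hypothetical helper, not part of hJAM: basic shape checks before fitting.
check_inputs <- function(A, betas, Gl) {
  A <- as.matrix(A)
  stopifnot(
    nrow(A) == length(betas),    # one row of A per SNP-outcome estimate
    ncol(Gl) == length(betas),   # assumes Gl has one column per SNP
    !anyNA(A), !anyNA(betas)
  )
  invisible(TRUE)
}

# Toy inputs with 4 SNPs and 2 intermediates
A_toy <- matrix(rnorm(8), nrow = 4)
betas_toy <- rnorm(4)
Gl_toy <- matrix(rbinom(40, 2, 0.3), ncol = 4)
check_inputs(A_toy, betas_toy, Gl_toy)
```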

The package includes two functions that let users visually check the SNPs used in the analysis:

scatter_plot_p = SNPs_scatter_plot(A = conditional_A, betas.Gy = betas.Gy, num_X = 2)
heatmap_p = SNPs_heatmap(Gl)

Conversion of $\hat{A}$ matrix

You can use the get_cond_A function to run JAM on the marginal $\hat{A}$ matrix and convert it into a conditional $\hat{A}$ matrix.

data("marginal_A")
cond_A = get_cond_A(marginal_A = marginal_A, Gl = Gl, N.Gx = 339224, ridgeTerm = TRUE)
cond_A[1:10, ]

hJAM

The default version of hJAM restricts the intercept to be zero.

hJAM::hJAM_lnreg(betas.Gy = betas.Gy, N.Gy = 459324, A = conditional_A, Gl = Gl, ridgeTerm = TRUE) # 459324 is the sample size of the UK Biobank GWAS of MI
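
To give a feel for a zero-intercept fit on transformed summary statistics, here is a minimal base R sketch. It is not the code inside hJAM_lnreg: it assumes simulated individual-level genotypes G and outcome y (the package works from betas.Gy, N.Gy and Gl instead), and all sizes and effect values are hypothetical.

```r
set.seed(2)
n <- 2000; m <- 6                                    # hypothetical sizes
G <- scale(matrix(rbinom(n * m, 2, 0.3), n, m), scale = FALSE)  # centred genotypes
A <- matrix(rnorm(m * 2, sd = 0.3), m, 2)            # assumed conditional SNP effects on 2 intermediates
pi_true <- c(0.4, -0.2)                              # assumed intermediate-outcome effects
y <- drop(G %*% A %*% pi_true) + rnorm(n)

z  <- crossprod(G, y)                                # summary statistic z = G'y
L  <- chol(crossprod(G))                             # G'G = L'L with L upper triangular
zL <- drop(backsolve(L, z, transpose = TRUE))        # zL = (L')^{-1} z
X  <- L %*% A

fit <- lm(zL ~ 0 + X)                                # intercept restricted to zero
pi_hat <- unname(coef(fit))

# Same point estimates as regressing y on the genetically predicted intermediates
GA <- G %*% A
pi_direct <- unname(coef(lm(y ~ 0 + GA)))
rbind(pi_hat, pi_direct)
```

Only the point estimates agree between the two fits; the residual degrees of freedom differ, so the standard errors reported by lm are not comparable.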

Another method in this package is hJAM with Egger regression, which is analogous to MR-Egger. It allows the intercept to be non-zero.

hJAM::hJAM_egger(betas.Gy = betas.Gy, N.Gy = 459324, A = conditional_A, Gl = Gl, ridgeTerm = TRUE) # 459324 is the sample size of the UK Biobank GWAS of MI
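
The effect of freeing the intercept can be seen in a small simulation. This is an unweighted, single-intermediate analogue in the spirit of MR-Egger, not the code inside hJAM_egger; the effect sizes and the pleiotropy value of 0.05 are hypothetical.

```r
set.seed(3)
m <- 100                                              # hypothetical number of SNPs
alpha <- runif(m, 0.05, 0.3)                          # SNP-intermediate effects
beta  <- 0.05 + 0.5 * alpha + rnorm(m, sd = 0.005)    # 0.05 = directional pleiotropy, 0.5 = true effect

fit_zero  <- lm(beta ~ 0 + alpha)   # intercept fixed at zero
fit_egger <- lm(beta ~ alpha)       # intercept estimated from the data

coef(fit_zero)                      # slope absorbs the pleiotropy and is biased upward
coef(fit_egger)                     # intercept captures the pleiotropy, slope stays near 0.5
```

When the intercept estimate is clearly non-zero, the zero-intercept estimate should be read with caution.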

Conclusion

We have presented the main usage of the hJAM package. For more details about each function, please see the package documentation. To give feedback or report an issue, please contact us on GitHub.




hJAM documentation built on March 26, 2020, 8:13 p.m.