Using RQT, an R package for gene-level meta-analysis

Overview

Despite recent advances in GWAS methods, estimating an effect size and a corresponding p-value for a whole gene, rather than for a single variant, remains an important problem. We developed an R package, \texttt{rqt}, which offers gene-level GWAS meta-analysis. The package can be included in a bioinformatics pipeline or used stand-alone. For questions about using \texttt{rqt} or any other issues, contact ilya.zhbannikov@duke.edu.

Below we provide several examples of gene-level GWAS meta-analysis.

Methods in brief

The workflow of gene-level meta-analysis consists of the following steps: (i) the number of predictors is reduced, which alleviates the problem of correlation among variants (accounting for LD); (ii) a regression model is fitted on the reduced dataset to obtain the corresponding regression coefficients ("effect sizes"); (iii) these coefficients are pooled into a single index representing the total gene-level effect size, and the corresponding test statistic is calculated; p- and q-values are then obtained from this statistic by asymptotic approximation or a permutation procedure; (iv) finally, the gene-level p-values calculated in each study are combined with Fisher's combined probability method.
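Steps (i) and (ii) can be illustrated with a minimal base-R sketch on simulated data. This is not the internal implementation of \texttt{rqt}: the simulated genotypes, the 80% variance threshold and all variable names are assumptions made for illustration. The pooling in step (iii) is left out because this vignette does not specify its formula.

```r
set.seed(1)
n <- 200  # individuals
m <- 10   # variants

# Toy genotypes (0/1/2); each variant copies its left neighbour for ~80% of
# individuals, which mimics correlated variants in LD.
geno <- matrix(rbinom(n * m, 2, 0.3), nrow = n, ncol = m)
for (j in 2:m) {
  copy <- runif(n) < 0.8
  geno[copy, j] <- geno[copy, j - 1]
}
pheno <- 0.5 * geno[, 1] + rnorm(n)  # continuous toy phenotype

# Step (i): PCA; keep the leading components explaining >= 80% of variance
# (the threshold is an assumption, not a value taken from rqt).
pca <- prcomp(geno, center = TRUE)
var_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k <- which(var_explained >= 0.8)[1]
scores <- pca$x[, seq_len(k), drop = FALSE]

# Step (ii): regress the phenotype on the retained components.
fit <- lm(pheno ~ scores)
coefs <- coef(summary(fit))[-1, , drop = FALSE]  # drop the intercept row
print(coefs)
```

Because the principal components are uncorrelated, the fitted coefficients do not suffer from the collinearity that a regression on the raw variants would.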

Installation of \emph{rqt} package

To install the \emph{rqt} package, the user must first install R (\url{http://www.r-project.org}) and the \emph{devtools} package. After that, \emph{rqt} can be installed with:

devtools::install_github("izhbannikov/rqt", build_vignettes = TRUE)

Data description

Single dataset

\texttt{rqt} requires the following data: (i) \texttt{phenotype}, an \texttt{N} by 1 matrix (i.e. a vector); and (ii) \texttt{genotype}, an object of class \texttt{SummarizedExperiment} containing one assay, an \texttt{N} by \texttt{M} matrix, where \texttt{N} is the total number of individuals in the study and \texttt{M} is the total number of genetic variants. Optionally, \texttt{rqt} accepts covariates in the form of an \texttt{N} by \texttt{K} matrix, where \texttt{K} is the total number of covariates used in the study. The phenotype can be dichotomous (0/1, where 1 indicates control and 0 case) or continuous.
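To make the expected shapes concrete, the sketch below simulates inputs with the dimensions described above. The 0/1/2 genotype coding, the covariate names \texttt{age} and \texttt{sex}, and all numeric values are assumptions for illustration only. The genotype matrix is wrapped only if the \texttt{SummarizedExperiment} package is installed.

```r
set.seed(42)
N <- 100  # individuals
M <- 8    # genetic variants
K <- 2    # covariates

# Phenotype: one value per individual (dichotomous here, coded 0/1)
pheno <- rbinom(N, 1, 0.5)

# Genotype: N x M matrix; the 0/1/2 allele-count coding is an assumption
geno <- matrix(rbinom(N * M, 2, 0.25), nrow = N, ncol = M)
colnames(geno) <- paste(seq_len(M))

# Optional covariates: N x K matrix (hypothetical "age" and "sex" columns)
covars <- cbind(age = rnorm(N, 60, 10), sex = rbinom(N, 1, 0.5))

# All inputs must describe the same N individuals, in the same order
stopifnot(length(pheno) == N, nrow(geno) == N, nrow(covars) == N)

# Wrap the genotype matrix as in the examples of this vignette
if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  geno.obj <- SummarizedExperiment::SummarizedExperiment(geno)
}
```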

Meta-analysis

In meta-analysis, \texttt{rqt} requires a list of \texttt{rqt} objects, one per dataset used in the meta-analysis. Each object is built from the data described above and may optionally include covariates in the same form.

Examples

Gene-level analysis on a single dataset

Dichotomous phenotype

library(rqt)

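# Load the example dataset: column 1 is the phenotype, the rest are genotypes
data <- data.matrix(read.table(system.file("extdata/test.bin1.dat",
                                           package="rqt"), header=TRUE))
pheno <- data[,1]
geno <- data[, 2:dim(data)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
# Wrap the genotype matrix and build the rqt object
geno.obj <- SummarizedExperiment(geno)
obj <- rqt(phenotype=pheno, genotype=geno.obj)
# Gene-level test with PCA preprocessing; "D" = dichotomous outcome
res <- geneTest(obj, method="pca", out.type = "D")
print(res)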

Continuous phenotype

library(rqt)

data <- data.matrix(read.table(system.file("extdata/test.cont1.dat",
                                           package="rqt"), header = TRUE))
pheno <- data[,1]
geno <- data[, 2:dim(data)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
obj <- rqt(phenotype=pheno, genotype=geno.obj)
res <- geneTest(obj, method="pca", out.type = "C")
print(res)

Preprocessing with Partial Least Squares regression (PLS)

This method is used for a continuous outcome, i.e. out.type = "C".

```r
library(rqt)

data <- data.matrix(read.table(system.file("extdata/test.cont1.dat",
                                           package="rqt"), header = TRUE))
pheno <- data[,1]
geno <- data[, 2:dim(data)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
obj <- rqt(phenotype=pheno, genotype=geno.obj)
res <- geneTest(obj, method="pls", out.type = "C")
print(res)
```

Preprocessing with Partial Least Squares Discriminant Analysis (PLS-DA)

This preprocessing method is intended for a dichotomous outcome.

```r
library(rqt)

data <- data.matrix(read.table(system.file("extdata/test.bin1.dat",
                                           package="rqt"), header=TRUE))
pheno <- data[,1]
geno <- data[, 2:dim(data)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
obj <- rqt(phenotype=pheno, genotype=geno.obj)
# Not yet supported, sorry!
#res <- geneTest(obj, method="pls", out.type = "D", scale = TRUE)
#print(res)
```

Using additional covariates

Quite often, researchers want to supply not only genetic data but also specific covariates representing physiological parameters or the environment (for example, to evaluate hypotheses of gene-environment interactions). In such cases, \texttt{rqt} accepts additional covariates in the form of an \texttt{N} by \texttt{K} matrix, as shown below:

```r
library(rqt)

data <- data.matrix(read.table(system.file("extdata/test.bin1.dat",
                                           package="rqt"), header = TRUE))
pheno <- data[,1]
geno <- data[, 2:dim(data)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
covars <- read.table(system.file("extdata/test.cova1.dat",package="rqt"),
    header=TRUE)
obj <- rqt(phenotype=pheno, genotype=geno.obj, covariates = covars)
res <- geneTest(obj, method="pca", out.type = "D")
print(res)
```

For a continuous phenotype:

```r
library(rqt)

data <- data.matrix(read.table(system.file("extdata/test.cont1.dat",
                                           package="rqt"), header = TRUE))
pheno <- data[,1]
geno <- data[, 2:dim(data)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
covars <- read.table(system.file("extdata/test.cova1.dat",package="rqt"),
    header=TRUE)
obj <- rqt(phenotype=pheno, genotype=geno.obj, covariates = covars)
res <- geneTest(obj, method="pca", out.type = "C")
print(res)
```

Meta-analysis

library(rqt)

data1 <- data.matrix(read.table(system.file("extdata/phengen2.dat",
                                            package="rqt"), skip=1))
pheno <- data1[,1]
geno <- data1[, 2:dim(data1)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
obj1 <- rqt(phenotype=pheno, genotype=geno.obj)

data2 <- data.matrix(read.table(system.file("extdata/phengen3.dat",
                                            package="rqt"), skip=1))
pheno <- data2[,1]
geno <- data2[, 2:dim(data2)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
obj2 <- rqt(phenotype=pheno, genotype=geno.obj)

data3 <- data.matrix(read.table(system.file("extdata/phengen.dat",
                                            package="rqt"), skip=1))
pheno <- data3[,1]
geno <- data3[, 2:dim(data3)[2]]
colnames(geno) <- paste(seq(1, dim(geno)[2]))
geno.obj <- SummarizedExperiment(geno)
obj3 <- rqt(phenotype=pheno, genotype=geno.obj)

res.meta <- geneTestMeta(list(obj1, obj2, obj3))
print(res.meta)
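The combination step named in the workflow, Fisher's combined probability method, can be written in a few lines of base R. Under the null hypothesis, and for independent studies, the statistic $-2\sum_i \ln p_i$ follows a chi-squared distribution with $2k$ degrees of freedom, where $k$ is the number of studies. The helper \texttt{fisher\_combine} and the p-values below are hypothetical and only illustrate the method; the internals of \texttt{geneTestMeta} may differ.

```r
# Fisher's combined probability test for k independent p-values.
# fisher_combine is a hypothetical helper, not part of rqt.
fisher_combine <- function(p) {
  stopifnot(is.numeric(p), length(p) >= 1, all(p > 0 & p <= 1))
  stat <- -2 * sum(log(p))   # test statistic
  df <- 2 * length(p)        # 2k degrees of freedom
  c(statistic = stat,
    df = df,
    p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Made-up gene-level p-values from three studies
p_studies <- c(0.04, 0.10, 0.20)
res_fisher <- fisher_combine(p_studies)
print(res_fisher)
```

With a single study the combined p-value equals the input p-value, which is a convenient sanity check for the helper.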

Session information

sessionInfo()


rqt documentation built on Nov. 8, 2020, 4:50 p.m.