SPAGE: SaddlePoint Approximation implementation of GxE analysis


View source: R/main.R

Description

Tests for association between the marginal multiplicative GxE interaction effect and a dichotomous phenotype.

Usage

SPAGE(obj.null, Envn.mtx, Geno.mtx, Cutoff = 2, impute.method = "none",
  missing.cutoff = 0.15, min.maf = 0, Firth.cutoff = 0,
  BetaG.cutoff = 0.001, BetaG.SPA = F, G.Model = "Add")

Arguments

obj.null

output object of function SPAGE_Null_Model.

Envn.mtx

a numeric environment matrix with each row as an individual and each column as an environmental factor. Column names of environmental factors and row names of subject IDs are required.

Geno.mtx

a numeric genotype matrix with each row as an individual and each column as a genetic variant. Column names of genetic variants and row names of subject IDs are required. Missing genotypes should be coded as NA. Both hard-called and imputed genotype data are supported.

Cutoff

a numeric value (default=2) to specify the standard deviation cutoff. If the test statistic lies within Cutoff standard deviations of its mean, its p value is calculated from the normal distribution approximation; otherwise, its p value is calculated from the saddlepoint approximation.
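The rule described for Cutoff can be sketched in base R as follows. This is only an illustration of the documented decision, not the package's internal code: the helper name `pick_approximation` and its arguments `null_mean` and `null_sd` are hypothetical, and treating the boundary with a strict `<` is an assumption.

```r
# Hypothetical helper (not part of SPAGE): decide which approximation
# would supply the p value for a given test statistic.
pick_approximation <- function(stat, null_mean, null_sd, Cutoff = 2) {
  # "Within Cutoff standard deviations of the mean" -> normal approximation;
  # anything further out in the tails -> saddlepoint approximation.
  ifelse(abs(stat - null_mean) < Cutoff * null_sd, "normal", "saddlepoint")
}

choice <- pick_approximation(c(0.5, 3.1), null_mean = 0, null_sd = 1)
print(choice)  # "normal" "saddlepoint"
```

A larger Cutoff sends more variants down the cheaper normal-approximation path, while a smaller one applies the saddlepoint approximation more often.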

impute.method

a character string (default="none") to specify the method used to impute missing genotypes. "bestguess" imputes missing genotypes as the most likely value (0, 1 or 2); "random" imputes missing genotypes by generating binomial(2,p) random variables, where p is the MAF; "fixed" imputes missing genotypes by assigning the mean genotype value (2p).
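The three imputation strategies can be sketched in base R for a single variant. This is a minimal sketch rather than the package's implementation: the function name `impute_genotype` is hypothetical, p is estimated here from the non-missing genotypes, and "most likely value" is interpreted as the mode of a binomial(2,p) distribution, which is an assumption.

```r
# Hypothetical helper (not part of SPAGE): impute NAs in one genotype vector.
impute_genotype <- function(g, method = c("bestguess", "random", "fixed")) {
  method <- match.arg(method)
  miss <- is.na(g)
  if (!any(miss)) return(g)
  p <- mean(g, na.rm = TRUE) / 2  # allele frequency from observed genotypes
  g[miss] <- switch(method,
    bestguess = which.max(dbinom(0:2, 2, p)) - 1,  # mode of binomial(2, p) (assumed meaning)
    random    = rbinom(sum(miss), 2, p),           # one binomial(2, p) draw per missing entry
    fixed     = 2 * p)                             # mean genotype value
  g
}

g <- c(0, 0, NA, 1, 0, NA, 0, 1)   # observed mean is 1/3, so p = 1/6
g_fixed <- impute_genotype(g, "fixed")
g_best  <- impute_genotype(g, "bestguess")
set.seed(1)
g_rand  <- impute_genotype(g, "random")
```

With "fixed", imputed entries are generally non-integer dosages, which matters if a dominant or recessive recoding is applied afterwards.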

missing.cutoff

a numeric value (default=0.15) to specify the cutoff for the missing rate of a SNP. Any SNP with a missing rate higher than the cutoff will be excluded from the analysis.

min.maf

a numeric value (default=0) to specify the cutoff of the minimal MAF. Any SNP with MAF < cutoff will be excluded from the analysis.
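The two filters (missing.cutoff and min.maf) can be sketched on a simulated genotype matrix. The data, the SNP names and the choice of 0.05 as the MAF threshold are illustrative assumptions; the MAF is computed here as the smaller of the two allele frequencies among non-missing genotypes, which may differ in detail from what the package does internally.

```r
set.seed(2)
# Simulated genotypes: 200 subjects, 4 SNPs (illustrative data only).
Geno.mtx <- matrix(rbinom(200 * 4, 2, 0.2), nrow = 200,
                   dimnames = list(paste0("ID", 1:200), paste0("SNP", 1:4)))
Geno.mtx[1:60, "SNP2"] <- NA          # 30% missing, above the 0.15 default

missing.cutoff <- 0.15                # documented default
min.maf        <- 0.05                # illustrative value; the default is 0

missing.rate <- colMeans(is.na(Geno.mtx))
af  <- colMeans(Geno.mtx, na.rm = TRUE) / 2
MAF <- pmin(af, 1 - af)

# Keep SNPs whose missing rate is not above the cutoff and whose MAF is not below min.maf.
keep <- missing.rate <= missing.cutoff & MAF >= min.maf
Geno.filtered <- Geno.mtx[, keep, drop = FALSE]
```

Using `drop = FALSE` keeps the result a matrix with column names even if only one SNP survives the filters.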

Firth.cutoff

a numeric value (default=0, no Firth output) to specify the p-value cutoff for Firth's test. Firth's test p-value is calculated only when the SPA p-value is less than the cutoff.

BetaG.cutoff

a numeric value (default=0.001) to specify the p-value cutoff for betaG estimation. See details for more information.

BetaG.SPA

a logical value (default=F) to determine whether p.value.BetaG is calculated based on SPA (TRUE) or on a normal distribution approximation (FALSE).

G.Model

a character string (default="Add") to determine the genetic model. Options include "Add" (default, no change), "Dom" (g>=1: 1; g<1: 0) and "Rec" (g>1: 1; g<=1: 0). Be careful when dosage genotype data is used. We do not check MAF before transformation.
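The recodings listed for G.Model can be written out directly in base R. The helper name `recode_genotype` is hypothetical and the sketch only mirrors the thresholds stated above; it also shows why dosage data needs care, since non-integer values are thresholded the same way as hard calls.

```r
# Hypothetical helper (not part of SPAGE): apply the documented thresholds.
recode_genotype <- function(g, G.Model = c("Add", "Dom", "Rec")) {
  G.Model <- match.arg(G.Model)
  switch(G.Model,
    Add = g,                      # additive: genotype left unchanged
    Dom = as.numeric(g >= 1),     # dominant: g >= 1 -> 1, g < 1 -> 0
    Rec = as.numeric(g > 1))      # recessive: g > 1 -> 1, g <= 1 -> 0
}

g <- c(0, 1, 2, 0.4, 1.6)         # last two values mimic imputed dosages
g_dom <- recode_genotype(g, "Dom")
g_rec <- recode_genotype(g, "Rec")
print(g_dom)  # 0 1 1 0 1
print(g_rec)  # 0 0 1 0 1
```

A dosage of 1.6 is coded as 1 under both models here, whereas a hard call of 1 would be coded as 0 under "Rec".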

Details

SPAGE (SaddlePoint Approximation implementation of G×E analysis) is a scalable and accurate method for genome-wide, phenome-wide G×E studies (PheWIS). To reduce computational cost, SPAGE fits a genotype-independent logistic model only once for the whole-genome analysis, and it uses a saddlepoint approximation (SPA) to calibrate the test statistics for phenotypes with unbalanced case-control ratios. When the genotypic effect is small or moderate, which is true for most variants, the method controls type I error rates well. The marginal genotypic effect is tested first (by normal approximation if BetaG.SPA=F, by SPA if BetaG.SPA=T); if its p value is less than the argument 'BetaG.cutoff', the test statistic and p value are updated.
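The idea of fitting the null model once and then screening the marginal genotypic effect with a normal approximation can be illustrated with a standard logistic score test in base R. This is a textbook score test on simulated data, offered as a sketch of the general approach; it is not the package's internal code, and the names `score_test` and `p_marginal` and the value reused for the cutoff comparison are illustrative assumptions.

```r
set.seed(1)
n <- 500
x <- cbind(1, rnorm(n))                       # design matrix: intercept + one covariate
y <- rbinom(n, 1, plogis(-2 + 0.5 * x[, 2]))  # unbalanced binary phenotype
G <- matrix(rbinom(n * 3, 2, 0.3), n, 3)      # three simulated variants

# Null model: fitted once, with no genotype term, then reused for every variant.
fit <- glm(y ~ x[, 2], family = binomial())
mu  <- fitted(fit)
w   <- mu * (1 - mu)

score_test <- function(g) {
  # Remove the part of the genotype explained by the covariates (weighted projection).
  g_adj <- g - x %*% solve(t(x) %*% (w * x), t(x) %*% (w * g))
  s <- sum(g_adj * (y - mu))     # score statistic
  v <- sum(w * g_adj^2)          # its variance under the null
  2 * pnorm(-abs(s) / sqrt(v))   # two-sided normal-approximation p value
}

p_marginal <- apply(G, 2, score_test)
flagged <- p_marginal < 0.001    # same value as the BetaG.cutoff default
```

Because only `mu` and `w` from the single null fit are needed, the per-variant cost is a few vector operations rather than a new model fit.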

Value

an R matrix with the following columns:

MAF

Minor allele frequencies

missing.rate

Missing rate

p.value.BetaG

p value of the marginal genotypic effect based on normal distribution approximation (BetaG.SPA=F) or saddlepoint approximation (BetaG.SPA=T)

p.value.spa-xx

p value of the marginal GxE effect based on saddlepoint approximation. xx is the name of the environmental factor

p.value.norm-xx

p value of the marginal GxE effect based on the normal distribution approximation. xx is the name of the environmental factor

p.value.Firth-xx

p value of the marginal GxE effect based on Firth's test. xx is the name of the environmental factor

Stat-xx

test statistic of the marginal GxE effect. xx is the name of the environmental factor

Var-xx

estimated variance of the marginal GxE effect. xx is the name of the environmental factor

z-xx

z score of the marginal GxE effect. xx is the name of the environmental factor
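Because the per-environment column names contain a hyphen, they have to be addressed as quoted strings. The sketch below uses a small mock matrix that follows the documented naming pattern, with "Cov1" standing in for xx; the numbers are placeholders made up for illustration, not SPAGE output.

```r
# Mock matrix following the documented column-name pattern.
# All values are placeholders, not results from SPAGE.
res <- matrix(c(0.30, 0.01, 0.42, 0.0031,
                0.25, 0.00, 0.08, 0.6400),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("SNP1", "SNP2"),
                              c("MAF", "missing.rate",
                                "p.value.BetaG", "p.value.spa-Cov1")))

# Index by the quoted column name; the hyphen rules out bare-name access.
p_spa <- res[, "p.value.spa-Cov1"]
hits  <- rownames(res)[p_spa < 0.05]
print(hits)  # "SNP1"
```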

Examples

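# Load the package (assumed to be installed from WenjianBI/SPAGE)
library(SPAGE)
# Simulate null data: 1000 subjects, 10 SNPs, 2 covariates
N = 1000
Data.ls = data.simu.null(N = N, nSNP = 10, nCov = 2, maf = 0.3, prev = 0.1)
subjectID = paste0("ID",1:N)
Phen.mtx = Data.ls$Phen.mtx
# Fit the genotype-independent null model once
obj.null = SPAGE_Null_Model(y ~ Cov1 + Cov2, subjectID = subjectID, data = Phen.mtx, out_type = "D")
# Use Cov1 as the environmental factor; drop=FALSE keeps the matrix shape and column name
Envn.mtx = as.matrix(Phen.mtx)[,"Cov1",drop=FALSE]
Geno.mtx = Data.ls$Geno.mtx
rownames(Geno.mtx) = rownames(Envn.mtx) = subjectID
# Run the GxE test with all other arguments left at their defaults
SPAGE(obj.null, Envn.mtx, Geno.mtx)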

WenjianBI/SPAGE documentation built on Nov. 13, 2020, 12:15 p.m.