VariantScan: a machine leaning tool for genetic variant association testing "

Instruction

This package provides a set of tool for performing association tests to identify (e)QTLs in genome-wide association studies (GWAS,MWAS,EWAS,PWAS). It integrates three methods, Linear Model, Local Polynomial Fitting (Nonlinear Model) and Generalized Additive Model (GAM) to carry out genome-wide scanning. These methods can be also applied to case-control studies, where the ROC is used to assess the model performance.

This vignette demonstrates the utility of "VariantScan" for association testing in genomics. In this vignette, we take the simulated genotypes of 640 individuals as an example. We compared two methods, linear regression and local polynomial regression for identifying QTLs.

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Install VariantScan

#Install from CRAN
#install.packages("VariantScan")
## or you can get the latest version of HierDpart from github
#library(devtools)

#install_github("xinghuq/VariantScan")
library("ggplot2")
library("VariantScan")

Load example data

# example genepop file
f <- system.file('extdata',package='VariantScan')
infile <- file.path(f, "sim1.csv")
## read genotype file
geno=read.csv(infile)

# traits
traitq=geno[,14]
genotype=geno[,-c(1:14)]

# get PCs as covariates

PCs=prcomp(genotype)
summary(PCs$x[,1:2])

Doing Vscan using local polynomial regression fitting without specifying covariates

loessW=VScan(x=genotype,y=(traitq),methods ="loess")

Doing Vscan using local polynomial regression fitting using PCs as covariates

loessWcv=VScan(x=genotype,y=(traitq),U=PCs$x[,1:2],methods ="loess")

try linear model

lmW=VScan(x=genotype,y=(traitq),methods ="lm")

lmWcv=VScan(x=genotype,y=(traitq),U=PCs$x[,1:2],methods ="lm")

Visualizing the association signatures

Plot Manhattan plot

Linear model

## 
Loci<-rep("Neutral", 1000)
Loci[c(201,211,221,231,241,251,261,271,281,291)]<-"QTL"
Selected_Loci<-Loci[-which(Loci=="Neutral")]


g1=ggplot() +
  geom_point(aes(x=which(Loci=="Neutral"), y=-log10(lmWcv$p_norm$p.value[-which(Loci!="Neutral")])), col = "gray83") +
  geom_point(aes(x=which(Loci!="Neutral"), y=-log10(lmWcv$p_norm$p.value[-which(Loci=="Neutral")]), colour = Selected_Loci)) +
  xlab("SNPs") + ylab("-log10(p-value)") +ylim(c(0,35))+theme_bw()

g1

Fig.1 Manhattan plot for linear model

Local polynomial regression model

## Manhattan plot


g2=ggplot() +
  geom_point(aes(x=which(Loci=="Neutral"), y=-log10(loessWcv$p_norm$p.value[-which(Loci!="Neutral")])), col = "gray83") +
  geom_point(aes(x=which(Loci!="Neutral"), y=-log10(loessWcv$p_norm$p.value[-which(Loci=="Neutral")]), colour = Selected_Loci)) +
  xlab("SNPs") + ylab("-log10(p-value)") +ylim(c(0,35))+theme_bw()

g2

Fig.2 Manhattan plot for local polynomial regression model

References

W. S. Cleveland, E. Grosse and W. M. Shyu (1992) Local regression models. Chapter 8 of Statistical Models in S eds J.M. Chambers and T.J. Hastie, Wadsworth & Brooks/Cole.



Try the VariantScan package in your browser

Any scripts or data that you put into this service are public.

VariantScan documentation built on June 30, 2022, 5:05 p.m.