README.md

VariantScan: a machine learning tool for variant association testing

Introduction

This package provides a set of tools for association testing to identify QTLs in genome-wide association studies (GWAS, MWAS, EWAS, PWAS). It integrates three methods to carry out genome-wide scanning: a linear model, local polynomial fitting (a nonlinear model), and a generalized additive model (GAM). These methods can also be applied to case-control studies, where the ROC curve is used to assess model performance.
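
As a rough illustration of why a nonlinear method can matter in a scan, the base-R sketch below scores simulated markers with both a linear model and a loess smoother. It is not VariantScan's implementation: the simulated data, the helper `var_explained`, and the use of variance explained (rather than p-values) as the score are all assumptions made for this example.

```r
set.seed(1)
n <- 200   # samples
m <- 10    # markers

# Simulated continuous marker values (e.g. methylation levels); illustrative only
markers <- matrix(runif(n * m), nrow = n,
                  dimnames = list(NULL, paste0("m", seq_len(m))))

# Marker m3 has a U-shaped (nonlinear) effect on the simulated trait
trait <- 8 * (markers[, 3] - 0.5)^2 + rnorm(n, sd = 0.3)

# Hypothetical helper: fraction of trait variance explained by a fitted model
var_explained <- function(fit, y) {
  1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
}

# Scan every marker with a linear model and with a loess smoother
r2_lm    <- apply(markers, 2, function(x) var_explained(lm(trait ~ x), trait))
r2_loess <- apply(markers, 2, function(x) var_explained(loess(trait ~ x), trait))

# Side-by-side scores for each marker
round(cbind(lm = r2_lm, loess = r2_loess), 3)
```

In this simulation the U-shaped effect of `m3` should be nearly invisible to the linear fit while the loess fit ranks it first, which is the kind of signal a nonlinear scan is meant to recover.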

Feedback and pull requests are welcome.

Installation

Install the package from GitHub:

library(devtools)

install_github("xinghuq/VariantScan")

library("VariantScan")

Testing the association between phenotypes and genotypes using genomic data

Get example file

f <- system.file('extdata',package='VariantScan')
infile <- file.path(f, "sim1.csv")
## read genotype file
geno=read.csv(infile)

# traits
traitq=geno[,14]
genotype=geno[,-c(1:14)]

# get PCs as covariates

PCs=prcomp(genotype)
PCs$x[,1:2]

## do Vscan using local polynomial regression fitting without specifying covariates

loessW=VScan(x=genotype,y=(traitq),methods ="loess")

## do Vscan using local polynomial regression fitting using PCs as covariates

loessWcv=VScan(x=genotype,y=(traitq),U=PCs$x[,1:2],methods ="loess")

## try linear model

lmW=VScan(x=genotype,y=(traitq),methods ="lm")
lmWcv=VScan(x=genotype,y=(traitq),U=PCs$x[,1:2],methods ="lm")
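
To show what the PC covariates are for, here is a small base-R simulation of two subpopulations in which the trait depends only on population membership. It is a sketch under stated assumptions, not VariantScan's internal code: the simulated data and the helper `snp_p` are hypothetical, and only `prcomp`, `lm`, and `summary` from base R are used.

```r
set.seed(42)
n_per_pop <- 150
m <- 100
pop <- rep(c(0, 1), each = n_per_pop)   # two simulated subpopulations

freq_a <- runif(m, 0.1, 0.5)            # allele frequencies in population 0
freq_b <- freq_a + 0.3                  # shifted frequencies in population 1

# Genotypes coded 0/1/2; each column is one SNP
geno_sim <- sapply(seq_len(m), function(j) {
  rbinom(2 * n_per_pop, size = 2, prob = ifelse(pop == 0, freq_a[j], freq_b[j]))
})

# The trait depends only on population membership, not on any SNP
trait_sim <- 2 * pop + rnorm(2 * n_per_pop)

# Top two principal components, extracted the same way as in the README
pcs <- prcomp(geno_sim)$x[, 1:2]

# Hypothetical helper: SNP p-value from lm, optionally adjusting for covariates
snp_p <- function(g, y, covars = NULL) {
  fit <- if (is.null(covars)) lm(y ~ g) else lm(y ~ g + covars)
  summary(fit)$coefficients["g", "Pr(>|t|)"]
}

p_raw <- apply(geno_sim, 2, snp_p, y = trait_sim)
p_adj <- apply(geno_sim, 2, snp_p, y = trait_sim, covars = pcs)

# Fraction of truly null SNPs called at p < 0.01, before and after adjustment
c(unadjusted = mean(p_raw < 0.01), pc_adjusted = mean(p_adj < 0.01))
```

Because every SNP here is null by construction, any call at p < 0.01 is a false positive. Without the PCs most SNPs should be flagged purely through population structure, and with them the rate should fall sharply.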


## Visualizing the association signatures 

Plot Manhattan plot

## Label the 1000 loci: 10 QTLs, the rest neutral
Loci<-rep("Neutral", 1000)
Loci[c(201,211,221,231,241,251,261,271,281,291)]<-"QT"
Selected_Loci<-Loci[-which(Loci=="Neutral")]

library(ggplot2)
## Manhattan plot

g1=ggplot() +
  geom_point(aes(x=which(Loci=="Neutral"), y=-log10(lmWcv$p_norm$p.value[-which(Loci!="Neutral")])), col = "gray83") +
  geom_point(aes(x=which(Loci!="Neutral"), y=-log10(lmWcv$p_norm$p.value[-which(Loci=="Neutral")]), colour = Selected_Loci)) +
  xlab("SNPs") + ylab("-log10(p-value)") +ylim(c(0,35))+theme_bw()

g1


g2=ggplot() +
  geom_point(aes(x=which(Loci=="Neutral"), y=-log10(loessWcv$p_norm$p.value[-which(Loci!="Neutral")])), col = "gray83") +
  geom_point(aes(x=which(Loci!="Neutral"), y=-log10(loessWcv$p_norm$p.value[-which(Loci=="Neutral")]), colour = Selected_Loci)) +
  xlab("SNPs") + ylab("-log10(p-value)") +ylim(c(0,35))+theme_bw()

g2
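
The plotting code above selects loci with `-which(...)`, which silently returns an empty vector when no element matches. As a hedged alternative, the sketch below keeps position, p-value, and label together in one data frame and uses logical indexing. The p-values are simulated stand-ins for `lmWcv$p_norm$p.value`, and the names `manhattan_df`, `qtl_idx`, and the colour choice are assumptions for illustration.

```r
set.seed(7)
n_snps <- 1000
qtl_idx <- seq(201, 291, by = 10)   # same QTL positions as in the README

# Simulated p-values standing in for a real scan result (illustrative only)
pvals <- runif(n_snps)
pvals[qtl_idx] <- 10^(-runif(length(qtl_idx), 5, 20))

# One row per SNP keeps position, p-value, and label aligned
manhattan_df <- data.frame(
  snp = seq_len(n_snps),
  neg_log10_p = -log10(pvals),
  locus = ifelse(seq_len(n_snps) %in% qtl_idx, "QT", "Neutral")
)

# Logical indexing behaves as intended even when nothing matches, whereas
# x[-which(cond)] returns an empty vector if cond is never TRUE
no_match <- manhattan_df$locus == "Other"
n_logical <- length(pvals[!no_match])          # all SNPs kept
n_which   <- length(pvals[-which(no_match)])   # no SNPs kept

# Draw the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(manhattan_df, aes(x = snp, y = neg_log10_p, colour = locus)) +
    geom_point() +
    scale_colour_manual(values = c(Neutral = "gray83", QT = "red")) +
    xlab("SNPs") + ylab("-log10(p-value)") + theme_bw()
  print(g)
}
```

With a single data frame, both point classes are drawn by one `geom_point()` layer and the legend follows from the `locus` column.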





VariantScan documentation built on June 30, 2022, 5:05 p.m.