email: johntlovell@gmail.com -- website: lovelleeb.weebly.com -- github: github.com/jtlovell/qtlTools
To search for candidate genes you need four objects.
To infer the potential of a candidate gene you need at least one of 7 datasets.
With these data, one can infer whether a gene is likely to contain the causal QTN(s)
\newpage
knitr::opts_chunk$set(echo = TRUE, fig.width = 5.5, fig.height = 3.5, warning = FALSE) library(knitr)
To start, you need the qtlTools package. Get it from github.
library(devtools) install_github("jtlovell/qtlTools") library(qtlTools)
Load the multitrait data from R/qtl
data(multitrait)
Create some fake physical positions of the markers allowing for low recombination in the middle of the chromosomes (as would be expected in the pericentromeric region)
map<-pullMap(multitrait) map$bp<-0 for(i in unique(map$chr)){ n<-sum(map$chr==i) p<-sin((1:n/n)*pi) map$bp[map$chr==i]<-cumsum(p*1000000) }
Create a fake gff file
gff<-data.frame(chr = rep(paste0("scaffold_",1:5),each = 200), feature = rep("gene",1000), start = rep(seq(from = 0, to = max(map$bp), length = 200), 5), end = rep(seq(from = 0, to = max(map$bp), length = 200))+1000, strand = rep("+",1000), attribute = paste0("gene",1:1000,";","gene",1:1000,".1"), stringsAsFactors=F)
\newpage
geneCM<-findGenecM(cross = multitrait, marker.info = map, gff = gff, gffCols = c("chr","feature","start","end","strand","attribute"))
Plots showing the bp/cM patterns
par(mfrow=c(2,3)) for(i in unique(map$chr)){ with(geneCM[geneCM$chr==i,], plot(pos,bp, col="grey", ylab = "physical position (bp)", xlab = "mapping position (cM)")) with(map[map$chr==i,], points(pos,bp, col=i, pch = 19, cex=.8)) }
\newpage
Make qtl intervals
multitrait<-calc.genoprob(multitrait) s1<-scanone(multitrait, method="hk", pheno.col=1) perm<-scanone(multitrait, n.perm=100, method="hk",pheno.col=1, verbose=FALSE) cis<-calcCis(cross = multitrait, s1.output=s1, perm.output=perm, drop=5) par(mfrow = c(1,1)) plot(s1) segmentsOnPeaks(multitrait, s1.output=s1, calcCisOutput = cis, int.y = 13.1)
Pull out genes in the intervals
candGenes<-findGenesInterval(findGenecM.output = geneCM, calcCis.output = cis) print(candGenes)
\newpage
There are a number of approaches to define how likely any gene is to be the candidate.
Here, we will explore the 4th option. First, we must simulate some gene expression data: Let's just focus on the chromosome 5 QTL and say there are 50 genes under the QTl interval. Here, we simulate expression (normalized around 0) with a few genes with correlated expression with the focal marker.
cross<-subset(multitrait, ind = !is.na(pull.pheno(multitrait, 1))) phe<-pull.pheno(cross, 1) mult.fact<-exp(seq(from = 0, to = 50, length.out = 50)) facs<-sapply(1:50, function(x){ scale(sapply(scale(phe), function(y) rnorm(n = 1, mean = y, sd = mult.fact[x]))) }) plot(sapply(1:50, function(x) cor(phe, facs[,x])), ylab = "cor. coef. (expression ~ chr5 QTL genotype)", xlab = "gene id")
In this simplistic example, we are simulating a case where expression drives a linear, additive QTL effect. However, this approach is extensible and permits inference of QTL*Treatment, epistasis, multiple QTL models and other cases.
So, lets run the covariate scan, testing how the QTL profile is affected by the presence of gene expression in the model.
expression.covariates = facs colnames(expression.covariates)<-paste0("gene",1:ncol(expression.covariates)) pheno.col = 1 qtl = makeqtl(cross, chr = max(s1)$chr, pos = max(s1)$pos, what = "prob") test_additive<-covScanQTL(cross = cross, pheno.col = 1, qtl = qtl, expression.covariates = expression.covariates, qtl.method = "hk", nperm = 20) kable(head(test_additive), caption = "top candidate genes using a simple, additive covariate scan approach.")
Now include another QTL in the model and enforce an epistatic interaction between the 2nd (focal) and 1st QTL.
qtl2 = makeqtl(cross, chr = summary(s1)$chr[4:5], pos = summary(s1)$pos[4:5], what = "prob") test_epistasis<-covScanQTL(cross = cross, pheno.col = 1, qtl = qtl2, which.epiqtl = 1, focalqtl.index = 2, expression.covariates = expression.covariates, qtl.method = "hk", nperm = 20) kable(head(test_epistasis), caption = "top candidate genes using a covariate scan that incorporates epistasis")
Now a little more complexity with an F2 cross and experimental covariates
data(fake.f2) cross<-fake.f2 phe<-pull.pheno(cross, 1) mult.fact<-exp(seq(from = 0, to = 50, length.out = 50)) facs<-sapply(1:50, function(x){ scale(sapply(scale(phe), function(y) rnorm(n = 1, mean = y, sd = mult.fact[x]))) }) facs<-data.frame(facs) colnames(facs)<-paste0("gene",1:ncol(facs)) cross<-calc.genoprob(cross) sex = data.frame(sex = pull.pheno(cross, pheno.col = "sex")) s1<-scanone(cross, addcovar = sex) qtl = makeqtl(cross, chr = summary(s1)$chr[1], pos = summary(s1)$pos[1], what = "prob") test_QTLxE<-covScanQTL(cross = cross, pheno.col = 1, qtl = qtl, addcovar = sex, intcovar = sex, expression.covariates = facs, qtl.method = "hk", nperm = 20) kable(head(test_QTLxE), caption = "top candidate genes using a covariate scan that incorporates GxE")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.