Intro to the qqman package

opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)

Intro to the qqman package

# This code used to generate the test data. Runs slow, but does the job.
chrstats <- data.frame(chr=1:22, nsnps=1500)
chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))

d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)), 
                CHR=rep(NA, sum(chrstats$nsnps)), 
                BP=rep(NA, sum(chrstats$nsnps)), 
                P=rep(NA, sum(chrstats$nsnps)))
snpi <- 1
for (i in chrstats$chr) {
    for (j in 1:chrstats[i, 2]) {
        d[snpi, ]$SNP=paste0("rs", snpi)
        d[snpi, ]$CHR=i
        d[snpi, ]$BP=j
        d[snpi, ]$P=runif(1)
        snpi <- snpi+1

divisor <- c(seq(2,50,2), seq(50,2,-2))
divisor <- divisor^4
d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
snpsOfInterest <- paste0("rs", 3001:3100)
manhattan(d, highlight=snpsOfInterest)
gwasResults <- d
save(gwasResults, file="data/gwasResults.RData")

The qqman package includes functions for creating manhattan plots and q-q plots from GWAS results. The gwasResults data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:


How many SNPs on each chromosome?$CHR))

Creating manhattan plots

Now, let's make a basic manhattan plot.


We can also pass in other graphical parameters. Let's add a title (main=), increase the y-axis limit (ylim=), reduce the point size to 60% (cex=), and reduce the font size of the axis labels to 90% (cex.axis=). While we're at it, let's change the colors (col=), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:

manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))

Now, let's look at a single chromosome:

manhattan(subset(gwasResults, CHR==1))

Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called snpsOfInterest. You'll get a warning if you try to highlight SNPs that don't exist.

manhattan(gwasResults, highlight=snpsOfInterest)

We can combine highlighting and limiting to a single chromosome, and use the xlim graphical parameter to zoom in on a region of interest (between position 200-500):

manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")

We can also annotate SNPs based on their p-value. By default, this only annotates the top SNP per chromosome that exceeds the annotatePval threshold.

manhattan(gwasResults, annotatePval=0.01)

We can also annotate all SNPs that meet a threshold:

manhattan(gwasResults, annotatePval=0.005, annotateTop=FALSE)

Finally, the manhattan function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the p= argument the name of the column we want to plot instead of the default "P" column. In this example, let's create a test statistic ("zscore"), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive lines because these are only meaningful if you're plotting -log10(p-values).

# Add test statistics
gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))

# Make the new plot
manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")

A few notes on creating manhattan plots:

Creating Q-Q plots

Creating Q-Q plots is straightforward - simply supply a vector of p-values to the qq() function.


We can otionally supply many other graphical parameters.

qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
   xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)

Try the qqman package in your browser

Any scripts or data that you put into this service are public.

qqman documentation built on April 19, 2021, 9:05 a.m.