knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/annotations-",
  fig.width = 6,
  fig.asp = 0.618,
  out.width = "70%",
  fig.align = "center"
)
library(knitr)

Annotations for Top Association

Load annotation information for switchgrass

The function pvdiv_table_topsnps creates dataframes containing annotation information for the switchgrass diversity panel. To use this function, first load the provided annotation information from wherever you have saved it (currently, this is version 5.1 of the annotation information for Panicum virgatum):

library(AnnotationDbi)
library(VariantAnnotation)
txdb <- loadDb(file = "Pvirgatum_516_v5.1.gene.txdb.sqlite")

Load SNP dataset and GWAS results

We will use the same toy SNP dataset that we used for genome-wide association, and find annotations for the same phenotype, GWAS_CT.

library(bigsnpr)

snpfile <- system.file("extdata", "example_bigsnp.rds", package = "switchgrassGWAS")
gwasfile <- system.file("extdata", "GWAS_datatable_example_GWAS_CT.rds", 
                        package = "switchgrassGWAS")

gwas <- readRDS(gwasfile)
snp <- snp_attach(snpfile)

Then, specify the number of top SNPs you want to find annotation information for, a FDR threshold to find annotation information for, and a set of distances away from the associated SNP (in bp) for which to pull annotations. Here, we find genes 1kb and 20kb away from 1) top 10 associations and 2) associations above a FDR of 10%.

#library(AnnotationDbi)
#library(VariantAnnotation)
#library(switchgrassGWAS)
#txdb <- loadDb(file = "../../pvdiv-genome/Pvirgatum_516_v5.1.gene.txdb.sqlite")

anno_file <- system.file("extdata", "Annotation_tables_GWAS_CT.rds", 
                         package = "switchgrassGWAS")
anno_tables <- readRDS(anno_file)
anno_tables <- pvdiv_table_topsnps(df = gwas, type = "bigsnp", n = 10, 
                                  FDRalpha = 0.1, rangevector = c(1000, 20000),
                                  snp = snp, txdb = txdb)

Let's look at the annotations for genes within 1000bp of SNPs above the 10% FDR cutoff. In our toy example, there are seven of these:

kable(anno_tables[[2]])

pvdiv_table_topsnps will return a list of all of the tables you requested, named according to the criteria used to create the table. You can then save these tables individually as csv or any other type of file.

If the resultant tables are small, say, less than 2000 entries apiece, I favor saving these tables as individual sheets in an Excel file using the R package XLConnect.

library(XLConnect)

wb1 <- loadWorkbook(filename = "Annotation_tables_GWAS_CT.xlsx", 
                    create = TRUE)
for(j in seq_along(anno_tables)){
  createSheet(wb1, name = names(anno_tables)[j])
  writeWorksheet(wb1, anno_tables[[j]], sheet = names(anno_tables)[j])
  }
saveWorkbook(wb1)


Alice-MacQueen/switchgrassGWAS documentation built on Jan. 23, 2022, 7:55 p.m.