avs.pevse: An EVSE pipeline using precomputed eQTLs.

Description Usage Arguments Value Author(s) See Also Examples

Description

The VSE method (avs.vse) provides a robust framework to cope with the heterogeneous structure of haplotype blocks, and has been designed to test enrichment in cistromes and epigenomes. In order to extend the variant set enrichment to genes this pipeline implements an additional step using precomputed expression quantitative trait loci (eQTLs).

Usage

1
2
3
avs.pevse(object, annotation, eqtls, maxgap=250, pValueCutoff=0.05, 
pAdjustMethod="bonferroni", boxcox=TRUE, lab="annotation", glist=NULL, 
minSize=100, verbose=TRUE)

Arguments

object

an object of class 'AVS' (see AVS-class).

annotation

a data frame with genomic annotations listing chromosome coordinates to which a particular property or function has been attributed. It should include the following columns: <CHROM>, <START>, <END> and <ID>. The <ID> column can be any genomic identifier, while values in <CHROM> should be listed in ['chr1', 'chr2', 'chr3' ..., 'chrX']. Both <START> and <END> columns correspond to chromosome positions mapped to the human genome assembly used to build the AVS object.

eqtls

object of class 'data.frame' with at least two columns, including the following column names: <RSID> and <GENEID>.

maxgap

a single integer value specifying the max distant (kb) used to assign the eQTLs in the 'eqtls' object.

pValueCutoff

a single numeric value specifying the cutoff for p-values considered significant.

pAdjustMethod

a single character value specifying the p-value adjustment method to be used (see 'p.adjust' for details).

boxcox

a single logical value specifying to use Box-Cox procedure to find a transformation of the null that approaches normality (when boxcox=TRUE) or not (when boxcox=FALSE). See powerTransform and bcPower.

lab

a single character value specifying a name for the annotation dataset (this option is overrided if 'glist' is used).

glist

an optional list with character vectors mapped to the 'annotation' data via <ID> column. This option can be used to run a batch mode for gene sets and regulons.

minSize

if 'glist' is provided, this argument is a single integer or numeric value specifying the minimum number of elements for each gene set in the 'glist'. Gene sets with fewer than this number are removed from the analysis. if 'fineMapping=FALSE', an alternative min size value can be provided as a vector of the form c(minSize1, minSize2) used to space the null distributions (see 'fineMapping').

verbose

a single logical value specifying to display detailed messages (when verbose=TRUE) or not (when verbose=FALSE).

Value

a data frame in the slot "results", see 'what' options in avs.get.

Author(s)

Mauro Castro, Steve Booth

See Also

AVS-class

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
## Not run: 

# This example requires the RTNdata package! (currently available under request)
library(RTNdata.LDHapMapRel27.hg18)
library(Fletcher2013b)
library(TxDb.Hsapiens.UCSC.hg18.knownGene)

##################################################
### Example of EVSE analysis for sets of genomic 
### annotations (e.g. regulons, gene sets, etc.)
##################################################

#--- step 1: load a precomputed AVS
data(bcavs, package="RTNdata.LDHapMapRel27.hg18")

#--- step 2: load genomic annotation for all genes
genemap <- as.data.frame(genes(TxDb.Hsapiens.UCSC.hg18.knownGene))
genemap <- genemap[,c("seqnames","start","end","gene_id")]
colnames(genemap) <- c("CHROM","START","END","ID")

#--- step 3: load a TNI object (or any other source of regulons)
#--- and prepare a gene set list.
#--- Note: gene ids should be the same as in the 'genemap' object.
data("rtni1st")
glist <- tni.get(rtni1st,what="refregulons",idkey="ENTREZ")
glist <- glist[ c("FOXA1","GATA3","ESR1") ] #reduce the list for demonstration!

#--- step 4: load precomputed eQTLs
#--- Please note that the input data should represent eQTLs from genome-wide
#--- calls, that is, the universe size should cover 'all SNPs' vs. 'all genes'.
#--- The correct represetation of universe size is essential to build the
#--- null distributions. Or, to put it another way, the eQTL analysis 
#--- should unbiasedly test linked and random markers from the AVS.
#--- In this example it represents 2029 + 966240 SNPs:

lkMarkers <- avs.get(bcavs,what="linkedMarkers")
length(lkMarkers) # i.e. 2029 risk associated and linked SNPs 

rdMarkers <- avs.get(bcavs,what="randomMarkers")
length(rdMarkers) # i.e. 966240 random SNPs

#--- Now we prepare a 'toy' dataset for demonstration purposes only
#--- by picking (naively) SNPs within 250 kb window around the 
#--- genomic annotation.

## load HapMap SNPs (release 27) mapped to hg18
data("popsnp2")

## map SNPs to the genomic annotation
query <- with(popsnp2, 
              GRanges(chrom, IRanges(position, position), 
              id=rsid)
              )
subject <- with(genemap, 
                GRanges(CHROM, IRanges(START, END), 
                id=ID)
                )
hits <- findOverlaps(query,subject, maxgap = 250000)

## reduce 'hits' just for demonstration
hits <- hits[sort(sample(length(hits), 50000))]

## build a 'toy_eqtls' data frame
toy_eqtls <- data.frame(rsid = popsnp2$rsid[from(hits)], 
                        geneid = genemap$ID[to(hits)])

#--- step 5: run the 'avs.pevse' pipeline
#--- important: set 'maxgap' to the same searching window used 
#--- in the dQTL analysis (e.g. 250kb)
bcavs <- avs.pevse(bcavs, annotation=genemap, glist=glist, 
                   eqtls=toy_eqtls, maxgap = 250)

#--- step 6: generate the pEVSE plots
avs.plot2(bcavs,"pevse",height=2.5)


#####################################################
#--- parallel version for 'step 5' with SNOW package!
# library(snow)
# options(cluster=snow::makeCluster(3, "SOCK"))
# bcavs <- avs.pevse(bcavs, annotation=genemap, glist=glist, 
#          eqtls=toy_eqtls, maxgap = 250)
# stopCluster(getOption("cluster"))

## ps. as a technical note, the parallel version uses a
## slightly different overlap-based operation, which might
## bring slightly different counts depending on the data 
## input organization


## End(Not run)

RTN documentation built on Nov. 12, 2020, 2:02 a.m.