getGeneModels: Return sets of ranges for individual gene model components...

View source: R/annotate.r

Description

Returns the genomic ranges of individual gene model components, for use in custom analyses that require them.

Usage

getGeneModels(genes = getGenes(geneset = "ucsc", genome = genome, cachedir =
  cachedir), promoter = c(1000, 500), end3 = c(1000, 1000), genome,
  cachedir, sync = TRUE)
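
As a hedged illustration of the signature above, the sketch below wraps a call that reuses an analysis-specific cache without checking for newer UCSC tables. The wrapper name `loadFrozenModels` and the example path are hypothetical. Only the arguments shown in the usage are passed, and the sketch assumes the goldmine package is installed and the cache directory already exists. Because `genes` is left at its default, the gene table is still fetched by getGenes() as shown in the usage.

```r
# Hypothetical helper: build gene models from a frozen, analysis-specific cache.
# Uses only arguments that appear in the usage signature above.
loadFrozenModels <- function(genome, cachedir) {
  goldmine::getGeneModels(
    promoter = c(1000, 500),   # 1000 bp upstream, 500 bp downstream of each TSS
    end3     = c(1000, 1000),  # 1000 bp either side of each transcription end site
    genome   = genome,
    cachedir = cachedir,
    sync     = FALSE           # skip the check for newer UCSC tables
  )
}

# Example call (not run here; the path is a placeholder):
# models <- loadFrozenModels("hg19", "~/analysis1/ucsc_cache")
```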

Arguments

genes

Genes of interest, taken from the output table of getGenes(). If omitted, defaults to the UCSC knownGene table.

promoter

A numeric vector of length 2 specifying the number of bp upstream and downstream of each transcription start site to include in the promoter ranges. Given as c(upstream, downstream). Note that "upstream" at the 5' end of a gene points away from the gene body.

end3

A numeric vector of length 2 specifying the number of bp upstream and downstream of each transcription end site to include in the gene 3' end ranges. Given as c(upstream, downstream). Note that "upstream" at the 3' end of a gene points into the gene body.

genome

The UCSC name of the genome that the query coordinates refer to (e.g. "hg19", "hg18", "mm10", etc.).

cachedir

A path to a directory where a local cache of UCSC tables is stored. If NULL (default), the data will be downloaded to temporary files and loaded on the fly. Caching is highly recommended to save time and bandwidth.

sync

If TRUE, then check if newer versions of UCSC tables are available and download them if so. If FALSE, skip this check. Can be used to freeze data versions in an analysis-specific cachedir for reproducibility.
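
The c(upstream, downstream) convention used by the promoter and end3 arguments can be illustrated with a small base-R sketch. This is not the package's implementation: the toy gene table, its column names, and the helper `strandWindow` are assumptions made for illustration, and the exact boundary handling inside goldmine (for example inclusive versus exclusive ends) is not stated on this page.

```r
# Toy gene table; the column names are illustrative only.
genes <- data.frame(
  name   = c("geneA", "geneB"),
  strand = c("+", "-"),
  start  = c(10000, 50000),  # leftmost genomic coordinate
  end    = c(20000, 60000),  # rightmost genomic coordinate
  stringsAsFactors = FALSE
)

# Hypothetical helper: window of c(upstream, downstream) bp around an anchor.
# "Upstream" is opposite to the direction of transcription, so on the minus
# strand it lies at higher genomic coordinates.
strandWindow <- function(anchor, strand, flank) {
  plus <- strand == "+"
  data.frame(
    start = ifelse(plus, anchor - flank[1], anchor - flank[2]),
    end   = ifelse(plus, anchor + flank[2], anchor + flank[1])
  )
}

# Transcription start and end sites depend on strand.
tss <- ifelse(genes$strand == "+", genes$start, genes$end)
tes <- ifelse(genes$strand == "+", genes$end, genes$start)

promoters <- strandWindow(tss, genes$strand, c(1000, 500))
ends3     <- strandWindow(tes, genes$strand, c(1000, 1000))
```

For the plus-strand gene the promoter window extends 1000 bp to the left of the start site, away from the gene body, while for the minus-strand gene it extends 1000 bp to the right. The 3' end windows extend 1000 bp into the gene body and 1000 bp beyond it in both cases.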

Value

A list containing one GenomicRanges object for each of the gene model portions: Promoters, 3' Ends, Exons, Introns, Intergenic, 5' UTRs, 3' UTRs. The "srow" column can be used to match individual ranges to individual genes in the table given to the "genes" argument by row number.
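
A minimal sketch of the row-number matching that "srow" enables, using plain data frames as stand-ins so that it runs without the package. The tables and their columns other than "srow" are invented for illustration. With a real GenomicRanges object, the "srow" values would be read from the object's metadata columns instead.

```r
# Stand-in for the table passed to the "genes" argument (columns illustrative).
genes <- data.frame(
  name   = c("geneA", "geneB", "geneC"),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# Stand-in for one returned set of ranges (e.g. exons), as a data frame.
exons <- data.frame(
  start = c(100, 300, 900, 1500),
  end   = c(200, 400, 1000, 1600),
  srow  = c(1, 1, 3, 2)
)

# srow holds row numbers into `genes`, so direct indexing recovers
# the gene that each range belongs to.
exons$gene <- genes$name[exons$srow]
```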


jeffbhasin/goldmine documentation built on Nov. 13, 2019, 9:11 a.m.