makeGeneRepresentation: Compute a gene representation from annotation.


View source: R/annotation.R

Description

Computes a gene representation from annotation, using one of several methods.

Usage

makeGeneRepresentation(annoData, type = c("UIgene", "Ugene", "ROCE",
  "background"), gene.id = "ensembl_gene_id",
  transcript.id = "ensembl_transcript_id", bind.columns,
  ignoreStrand = TRUE, verbose = getOption("verbose"))

Arguments

annoData

A data frame that must contain the columns chr, start, end and strand, which specify the annotation regions of interest. Additional columns are optional.

type

The type of gene representation; see Details.

gene.id

The column in annoData that holds the gene identifiers (only needed for certain types of representation).

transcript.id

The column in annoData that holds the transcript identifiers (only needed for certain types of representation).

bind.columns

A character vector of column names that will be kept in the return object. It is assumed (but not checked) that these values are constant for all regions in a gene.

ignoreStrand

Logical; should strand be ignored? Little testing has been done for the value 'TRUE'.

verbose

Logical; should verbose output be produced?
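
As an illustration of the input described above, the following is a minimal sketch of an annotation data frame with the four required columns plus the default identifier columns. The values, the integer coding of chr and strand, and the helper has_required_columns are assumptions made for this sketch; they are not taken from the package.

```r
# Hypothetical annotation: two transcripts of one gene and one transcript
# of a second gene. Integer codes for chr and strand are an assumption.
annoData <- data.frame(
  chr    = c(1L, 1L, 1L),
  start  = c(1L, 5L, 14L),
  end    = c(10L, 15L, 20L),
  strand = c(1L, 1L, -1L),
  ensembl_gene_id       = c("g1", "g1", "g2"),
  ensembl_transcript_id = c("t1", "t2", "t3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of Genominator): check that the columns
# named in the argument description are present.
has_required_columns <- function(x) {
  all(c("chr", "start", "end", "strand") %in% names(x))
}

has_required_columns(annoData)
```

A data frame of this shape would be passed as annoData, with gene.id and transcript.id left at their defaults because the identifier columns use the default names.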

Details

A union representation (Ugene) is simply the union of all bases of all transcripts of the gene, with bases belonging to other genes removed.

A union-intersection representation (UIgene) for a gene is defined as bases that are annotated as belonging to all transcripts of the gene, and not to any other gene.
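
The two definitions above can be made concrete with a small base-level sketch in base R. It is not the package's implementation: the data, the column names gene, start and end, and the helpers bases_of and gene_bases are hypothetical, and strand and chromosome are ignored for brevity.

```r
# Hypothetical annotation: gene g1 has two transcripts, gene g2 has one.
anno <- data.frame(
  gene  = c("g1", "g1", "g2"),
  start = c(1, 5, 14),
  end   = c(10, 15, 20),
  stringsAsFactors = FALSE
)

# Expand each annotation row into the vector of bases it covers.
bases_of <- function(rows) Map(seq, rows$start, rows$end)

# Bases of gene g under the union (Ugene) or union-intersection (UIgene)
# definition, with bases annotated to any other gene removed.
gene_bases <- function(anno, g, type = c("Ugene", "UIgene")) {
  type  <- match.arg(type)
  own   <- bases_of(anno[anno$gene == g, ])
  other <- unlist(bases_of(anno[anno$gene != g, ]))
  combined <- if (type == "Ugene") Reduce(union, own) else Reduce(intersect, own)
  setdiff(combined, other)
}

u  <- gene_bases(anno, "g1", "Ugene")   # bases in any transcript of g1
ui <- gene_bases(anno, "g1", "UIgene")  # bases in every transcript of g1
range(u)
range(ui)
```

In this toy case the union for g1 spans bases 1 to 13 (bases 14 and 15 overlap g2 and are dropped), while the union-intersection spans bases 5 to 10.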

Regions of constant expression (ROCE) are regions where one would assume that the expression is constant. They are best explained by an example: if transcript A goes from 1 to 4 and transcript B goes from 1 to 6 there are two ROCEs, one from 1 to 4 and one from 5 to 6. It is possible to define ROCEs independent of the gene concept, but in its current implementation regions belonging to more than one gene are removed.
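
The ROCE example above can be reproduced with a short base R sketch. The helper roce_sketch is hypothetical and is not the package's implementation; it works base by base on a single region and does not handle the removal of regions shared between genes.

```r
# Hypothetical helper: split the covered bases into maximal runs over which
# the set of overlapping transcripts does not change.
roce_sketch <- function(tx) {
  pos <- seq(min(tx$start), max(tx$end))
  # For each base, record which transcripts cover it, e.g. "A,B".
  membership <- vapply(pos, function(p) {
    paste(tx$id[tx$start <= p & tx$end >= p], collapse = ",")
  }, character(1))
  covered <- membership != ""
  pos <- pos[covered]
  membership <- membership[covered]
  # A new region starts where the membership changes or a gap occurs.
  new_region <- c(TRUE,
                  membership[-1] != membership[-length(membership)] |
                    diff(pos) != 1)
  region <- cumsum(new_region)
  data.frame(
    start = as.integer(tapply(pos, region, min)),
    end   = as.integer(tapply(pos, region, max)),
    transcripts = as.character(tapply(membership, region, function(m) m[1])),
    stringsAsFactors = FALSE
  )
}

# Transcript A covers 1 to 4, transcript B covers 1 to 6.
tx <- data.frame(id = c("A", "B"), start = c(1, 1), end = c(4, 6),
                 stringsAsFactors = FALSE)
roce <- roce_sketch(tx)
roce
```

For the two transcripts in the text this yields two regions, 1 to 4 (covered by A and B) and 5 to 6 (covered by B only).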

Background is essentially the complement of the annotation.
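
As a rough illustration of the complement idea, the sketch below computes the unannotated intervals on a single hypothetical chromosome in base R. The chromosome length and the annotation values are invented for the example, and how the package itself treats chromosome ends and strand is not specified here.

```r
chr_length <- 25  # hypothetical chromosome length, an assumption of this sketch
anno <- data.frame(start = c(3, 10, 18), end = c(6, 12, 20))

# All annotated bases, then every base on the chromosome that is not annotated.
annotated <- unlist(Map(seq, anno$start, anno$end))
bg <- setdiff(seq_len(chr_length), annotated)

# Collapse runs of consecutive background bases into intervals.
run <- cumsum(c(TRUE, diff(bg) != 1))
background <- data.frame(
  start = as.integer(tapply(bg, run, min)),
  end   = as.integer(tapply(bg, run, max))
)
background
```

Here the background consists of the four intervals 1 to 2, 7 to 9, 13 to 17 and 21 to 25.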

Value

A data.frame with rownames and columns chr, strand, start, end, and possibly additional columns.

Author(s)

James Bullard bullard@berkeley.edu, Kasper Daniel Hansen khansen@jhsph.edu

Examples


Genominator documentation built on Oct. 31, 2019, 8:56 a.m.