gene_annotate: Annotate peak file based on gene.

View source: R/gene_annotate.r

Description

Annotates a user's peaks file (preprocessed with the peaksInput() command) with gene information, based on an optimally chosen geneXtendeR upstream extension file, and compresses the annotations by gene. This command requires a preprocessed "peaks.txt" file (generated by peaksInput()) in the user's working directory; otherwise, the user is prompted to rerun peaksInput() to regenerate it.

Usage

gene_annotate(organism, extension)

Arguments

organism

Object name assigned from readGFF() command.

extension

Desired upstream extension.

Value

The gene coordinates are extended by 'extension' at the 5-prime end and by 500 bp at the 3-prime end. The peaks file is then overlaid on these new gene coordinates, producing a file of peaks annotated with gene ID, gene name, gene location, mean and standard deviation of peaks-to-genes, number of peaks-to-genes, and peak-to-gene genomic distance (in bp). Distance is calculated between the 5-prime end of the gene and the 3-prime end of the peak.

A data.table formatted version of the gene-annotated file for checking or further calculations.

(From annotate.r) The gene coordinates are extended by 'extension' at the 5-prime end and by 500 bp at the 3-prime end. The peaks file is then overlaid on these new gene coordinates, producing a file of peaks annotated with gene ID, gene name, and gene-to-peak genomic distance (in bp). Distance is calculated between the 5-prime end of the gene and the 3-prime end of the peak.
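The extension rule described above can be illustrated with a minimal base R sketch. It is not geneXtendeR's implementation: the toy data frame, its column names, the helper extend_genes(), and the clamping of coordinates at 1 are all assumptions made for illustration. It assumes that "5-prime end" is strand-aware, so the upstream extension is applied to the start of a plus-strand gene and to the end of a minus-strand gene.

```r
# Hypothetical toy genes; the column names are illustrative assumptions,
# not the format geneXtendeR uses internally.
genes <- data.frame(
  gene_id = c("g1", "g2"),
  start   = c(10000, 50000),
  end     = c(12000, 53000),
  strand  = c("+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: extend each gene by `extension` bp upstream (5-prime)
# and by `downstream` bp (500 by default) at the 3-prime end.
extend_genes <- function(genes, extension, downstream = 500) {
  plus <- genes$strand == "+"
  # On the + strand the 5-prime end is `start`; on the - strand it is `end`.
  new_start <- ifelse(plus, genes$start - extension, genes$start - downstream)
  new_end   <- ifelse(plus, genes$end + downstream, genes$end + extension)
  genes$ext_start <- pmax(new_start, 1)  # assumption: clamp at position 1
  genes$ext_end   <- new_end
  genes
}

extended <- extend_genes(genes, extension = 3400)

# A hypothetical peak is "overlaid" by testing interval overlap with the
# extended coordinates.
peak_start <- 7000
peak_end   <- 7200
hits <- peak_start <= extended$ext_end & peak_end >= extended$ext_start
extended$gene_id[hits]
```

With a 3400 bp extension, the plus-strand gene g1 spans 6600 to 12500 and the minus-strand gene g2 spans 49500 to 56400, so the example peak falls only within the extended region of g1, 2800 bp upstream of the original gene start.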

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
gene_annotate(rat, 3400)

geneXtendeR documentation built on Nov. 8, 2020, 11:09 p.m.