get_region_target_gene: Obtain target genes of input regions based on distance

View source: R/get_region_target_gene.R

get_region_target_geneR Documentation

Obtain target genes of input regions based on distance

Description

To map an input region to genes there are three options: 1) map region to closest gene tss 2) map region to all genes within a window around the region (default window.size = 500kbp (i.e. +/- 250kbp from start or end of the region)). 3) map region to a fixed number of nearby genes (upstream/downstream)

Usage

get_region_target_gene(
  regions.gr,
  genome = c("hg38", "hg19"),
  method = c("genes.promoter.overlap", "window", "nearby.genes", "closest.gene.tss"),
  promoter.upstream.dist.tss = 2000,
  promoter.downstream.dist.tss = 2000,
  window.size = 500 * 10^3,
  num.flanking.genes = 5,
  rm.promoter.regions.from.distal.linking = TRUE
)

Arguments

regions.gr

A Genomic Ranges object (GRanges) or a SummarizedExperiment object (rowRanges will be used)

genome

Human genome of reference "hg38" or "hg19"

method

How genes are mapped to regions: region overlapping gene promoter ("genes.promoter.overlap"); or genes within a window around the region ("window"); or a fixed number genes upstream and downstream of the region ("nearby.genes"); or closest gene tss to the region ("closest.gene.tss")

promoter.upstream.dist.tss

Number of base pairs (bp) upstream of TSS to consider as promoter regions. Defaults to 2000 bp.

promoter.downstream.dist.tss

Number of base pairs (bp) downstream of TSS to consider as promoter regions. Defaults to 2000 bp.

window.size

When method = "window", number of base pairs to extend the region (+- window.size/2). Default is 500kbp (or +/- 250kbp, i.e. 250k bp from start or end of the region)

num.flanking.genes

When method = "nearby.genes", set the number of flanking genes upstream and downstream to search.Defaults to 5. For example, if num.flanking.genes = 5, it will return the 5 genes upstream and 5 genes downstream of the given region.

rm.promoter.regions.from.distal.linking

When performing distal linking with method = "windows", "nearby.genes" or "closest.gene.tss", if set to TRUE (default), probes in promoter regions will be removed from the input.

Details

For the analysis of probes in promoter regions (promoter analysis), we recommend setting method = "genes.promoter.overlap".

For the analysis of probes in distal regions (distal analysis), we recommend setting either method = "window" or method = "nearby.genes".

Note that because method = "window" or method = "nearby.genes" are mainly used for analyzing distal probes, by default rm.promoter.regions.from.distal.linking = TRUE to remove probes in promoter regions.

Value

A data frame with the following information: regionID, Target symbol, Target ensembl ID

Examples

library(GenomicRanges)
library(dplyr)

# Create example region
regions.gr <- data.frame(
       chrom = c("chr22", "chr22", "chr22", "chr22", "chr22"),
       start = c("39377790", "50987294", "19746156", "42470063", "43817258"),
       end   = c("39377930", "50987527", "19746368", "42470223", "43817384"),
       stringsAsFactors = FALSE)  %>%
     makeGRangesFromDataFrame

 # map to closest gene tss
 region.genes.promoter.overlaps <- get_region_target_gene(
                      regions.gr = regions.gr,
                      genome = "hg19",
                      method = "genes.promoter.overlap"
 )

 # map to all gene within region +- 250kbp
 region.window.genes <- get_region_target_gene(
                      regions.gr = regions.gr,
                      genome = "hg19",
                      method = "window",
                      window.size = 500 * 10^3
 )

 # map regions to n upstream and n downstream genes
 region.nearby.genes <- get_region_target_gene(
                      regions.gr = regions.gr,
                      genome = "hg19",
                      method = "nearby.genes",
                      num.flanking.genes = 5
 )

TransBioInfoLab/MethReg documentation built on July 28, 2023, 9:17 p.m.