mapByFeature | R Documentation |
Map Genomic Ranges to genes using defined regulatory features
mapByFeature(
gr,
genes,
prom,
enh,
gi,
cols = c("gene_id", "gene_name", "symbol"),
gr2prom = 0,
gr2enh = 0,
gr2gi = 0,
gr2gene = 1e+05,
prom2gene = 0,
enh2gene = 1e+05,
gi2gene = 0,
...
)
gr |
GRanges object with query ranges to be mapped to genes |
genes |
GRanges object containing genes (or any other nominal feature) to be assigned |
prom |
GRanges object defining promoters |
enh |
GRanges object defining Enhancers |
gi |
GInteractions object defining interactions. Mappings from interactions to genes should be performed as a separate prior step. |
cols |
Column names to be assigned as mcols in the output. Columns
must be minimally present in |
gr2prom |
The maximum permissible distance between a query range and any ranges defined as promoters |
gr2enh |
The maximum permissible distance between a query range and any ranges defined as enhancers |
gr2gi |
The maximum permissible distance between a query range and any ranges defined as GInteraction anchors |
gr2gene |
The maximum permissible distance between a query range and genes (for ranges not otherwise mapped) |
prom2gene |
The maximum permissible distance between a range provided
in |
enh2gene |
The maximum permissible distance between a range provided
in |
gi2gene |
The maximum permissible distance between a GInteractions
anchor (provided in |
... |
Passed to findOverlaps and overlapsAny internally |
This function uses feature-level information and long-range interactions to improve the mapping of regions to genes. When regulatory features are provided, ranges are mapped to genes using those features as a framework. The following sequential strategy is used:
1. Ranges overlapping a promoter are assigned to that gene
2. Ranges overlapping an enhancer are assigned to all genes within a specified distance
3. Ranges overlapping a long-range interaction are assigned to all genes connected by the interaction
4. Ranges with no gene assignment from the previous steps are assigned to all overlapping genes or the nearest gene within a specified distance
If information is missing for one of these steps, the algorithm simply proceeds to the next step. If no promoter, enhancer or interaction data is provided, all ranges are mapped by step 4 alone. Ranges can be mapped by any or all of the first three steps, but step 4 is mutually exclusive with the first three steps.
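The sequential logic can be sketched with plain GenomicRanges calls. This is an illustrative approximation covering only the promoter step and the fallback step, not the function's actual implementation. The helper name `sketch_map` is hypothetical, and applying the distances through `maxgap` and `distanceToNearest` is an assumption.

```r
library(GenomicRanges)

## Toy data modelled on the Examples section
genes <- GRanges(c("chr1:2-10:*", "chr1:25-30:-", "chr1:31-40:+"))
genes$gene_id <- paste0("gene", seq_along(genes))
prom <- promoters(genes, upstream = 1, downstream = 1)
gr <- GRanges("chr1", IRanges(c(1, 15, 30, 45, 60), width = 1))

## Hypothetical helper: promoter step first, fallback only for what is left
sketch_map <- function(gr, genes, prom, gr2prom = 0, gr2gene = 1e5) {
  mapped <- vector("list", length(gr))

  ## Promoter step: ranges within gr2prom of a promoter take its gene
  hits <- findOverlaps(gr, prom, maxgap = gr2prom)
  by_query <- split(prom$gene_id[subjectHits(hits)], queryHits(hits))
  mapped[as.integer(names(by_query))] <- by_query

  ## Fallback step: only ranges that are still unmapped are considered
  todo <- which(lengths(mapped) == 0)
  ol <- findOverlaps(gr[todo], genes)            # all overlapping genes
  near <- distanceToNearest(gr[todo], genes)     # nearest gene per range
  near <- near[mcols(near)$distance <= gr2gene]  # apply the distance limit
  q <- c(queryHits(ol), queryHits(near))
  s <- c(subjectHits(ol), subjectHits(near))
  fallback <- lapply(split(genes$gene_id[s], todo[q]), unique)
  mapped[as.integer(names(fallback))] <- fallback
  mapped
}

sketch_map(gr, genes, prom, gr2gene = 25)
```

Because the fallback only sees ranges with no earlier assignment, a range that already hit a promoter never picks up extra genes by proximity, which mirrors the mutual exclusivity described above.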
Distances between each set of features and the query range can be individually specified by modifying the gr2prom, gr2enh, gr2gi or gr2gene parameters. Distances between features and genes can also be set using the parameters prom2gene, enh2gene and gi2gene.
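To give a feel for what a maximum permissible distance means, the sketch below uses the `maxgap` argument of `overlapsAny` and `findOverlaps`, the two functions named under `...`. Whether each distance parameter is passed on exactly as `maxgap` is an assumption here. The objects `query` and `feature` are made up for illustration.

```r
library(GenomicRanges)

## A single-base query and a feature separated by a 9 bp gap
query <- GRanges("chr1", IRanges(start = 15, width = 1))
feature <- GRanges("chr1", IRanges(start = 25, end = 30))

## A limit smaller than the gap gives no match, a larger one does
within_5 <- overlapsAny(query, feature, maxgap = 5)
within_10 <- overlapsAny(query, feature, maxgap = 10)
c(within_5 = within_5, within_10 = within_10)

## findOverlaps reports which feature was matched under the same limit
hits <- findOverlaps(query, feature, maxgap = 10)
length(hits)
```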
Additionally, if previously defined mappings are included with any of the prom, enh or gi objects, these will be used in preference to any obtained from the genes object.
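A previously defined mapping can be attached as a metadata column before the feature is supplied. The sketch below builds a made-up enhancer and assigns it a curated target gene. Both the coordinates and the target are hypothetical, and the column name simply follows the default `cols`.

```r
library(GenomicRanges)

genes <- GRanges(c("chr1:2-10:*", "chr1:25-30:-", "chr1:31-40:+"))
genes$gene_id <- paste0("gene", seq_along(genes))

## Hypothetical enhancer whose target gene is already known
enh <- GRanges("chr1", IRanges(start = 45, end = 50))
enh$gene_id <- "gene1"

## By position alone, the closest gene would be a different one
genes$gene_id[nearest(enh, genes)]
enh
```

Passing this object as `enh` should then carry the curated target through in preference to a gene found by distance, as described above.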
A GRanges object with added mcols as specified
## Load the package (mapByFeature is assumed here to come from extraChIPs)
library(extraChIPs)
## Define some genes
genes <- GRanges(c("chr1:2-10:*", "chr1:25-30:-", "chr1:31-40:+"))
genes$gene_id <- paste0("gene", seq_along(genes))
genes
## Add a promoter for each gene
prom <- promoters(genes, upstream = 1, downstream = 1)
prom
## Some ranges to map
gr <- GRanges(paste0("chr1:", seq(0, 60, by = 15)))
gr
## Map so that any gene within 25bp of the range is assigned
mapByFeature(gr, genes, gr2gene = 25)
## Now use promoters to be more accurate in the gene assignment
## Given that the first range overlaps the promoter of gene1, this is a
## more targeted approach. Similarly for the third range
mapByFeature(gr, genes, prom, gr2gene = 25)