mapPeaks: Map peaks to closest features


View source: R/peakMapping.R

Description

Annotate a set of genomic coordinates (peaks) by finding the closest genomic features supplied by the user. For mapping purposes, peaks and features without strand information (i.e., a strand of ".") are assumed to be on the forward strand. A peak is considered to be at a distance of 0 from any feature it overlaps. If several features are equally close to a peak, the peak maps to all of them and appears in the output once for each feature.
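The distance and tie rules described above can be illustrated with a minimal sketch in base R. This is not the package's implementation: intervalDistance and the toy features data frame are hypothetical, strand is ignored, and all intervals are assumed to lie on the same chromosome.

```r
# Hypothetical helper, not part of PeakMapper: distance between one peak and
# each feature on the same chromosome, with 0 for any overlap.
intervalDistance <- function(peakStart, peakEnd, featStart, featEnd) {
  # The gap is positive on at most one side; overlapping intervals give 0.
  pmax(0, featStart - peakEnd, peakStart - featEnd)
}

# Toy features (made-up coordinates for illustration only).
features <- data.frame(
  name  = c("geneA", "geneB", "geneC"),
  start = c(100, 400, 900),
  end   = c(200, 500, 1000)
)

# A peak spanning 250-350 lies 50 bp from both geneA and geneB.
d <- intervalDistance(250, 350, features$start, features$end)

# Keep every feature at the minimal distance, so a tie yields several rows.
nearest <- features[d == min(d), ]
nearest$distance <- d[d == min(d)]
nearest
```

Here the peak is reported twice, once for each of the two equally close features, which mirrors the repeated rows described above.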

Usage

mapPeaks(peakFrame, featureFrame, verbose = F)

Arguments

peakFrame

A dataframe returned by the importBED function, where each row is a different peak to be annotated.

featureFrame

A dataframe returned by the importBED function, where each row is a different feature to be used for peak annotation.

verbose

If TRUE, mapPeaks prints a message to the console each time it begins processing a new peak, which is useful for monitoring calls on large datasets that may take a while to finish. It also prints any peaks that do not map onto any features (e.g. because they lie on different chromosomes).

Details

Aside from the overlap percentages, the structure of the returned dataframe is based on the annotation output produced by ChIPseeker (see References), although no code here is taken from ChIPseeker. The use of dplyr's bind_rows function to combine a list of dataframes into one dataframe is based on a Stack Overflow post by Joe Klieg (see References).

Value

A dataframe giving the nearest feature(s) in featureFrame to each peak in peakFrame. Each row gives a peak, a feature at minimal distance from it, and the distance and position of the peak relative to that feature, where position is some combination of upstream, downstream, and overlapping (see the documentation of getDistanceToPeak for further details on position). The final column gives the percentage of the feature that is overlapped by the mapped peak. The same peak may appear in multiple consecutive rows if it maps onto multiple features (e.g. in the case of multiple overlaps). A peak is omitted from the output if there are no features on its chromosome. Returns NULL if none of the peaks map onto any features (because they are all on different chromosomes from the features).
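Because the result may be NULL and may repeat a peak across several rows, downstream code usually needs to handle both cases. The following is a minimal sketch under stated assumptions: countFeaturesPerPeak is a hypothetical helper, and the Peak_Name and Feature_Name columns in the toy dataframe are placeholders rather than the documented output columns.

```r
# Hypothetical helper, not part of PeakMapper: counts how many rows (features)
# each peak has in a mapping result, returning an empty vector for NULL.
countFeaturesPerPeak <- function(mappingResults, peakColumn) {
  if (is.null(mappingResults)) {
    return(integer(0))  # no peak mapped onto any feature
  }
  counts <- table(mappingResults[[peakColumn]])
  setNames(as.integer(counts), names(counts))
}

# Toy stand-in for a mapping result; the column names are assumptions.
toyResults <- data.frame(
  Peak_Name    = c("peak1", "peak1", "peak2"),
  Feature_Name = c("geneA", "geneB", "geneC")
)

countFeaturesPerPeak(toyResults, "Peak_Name")  # peak1 occupies two rows
countFeaturesPerPeak(NULL, "Peak_Name")        # empty integer vector
```

Peaks with a count above 1 are those that tie between, or overlap, several features.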

References

Guangchuang Yu, Li-Gen Wang, and Qing-Yu He. "ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization". Bioinformatics 2015, 31(14):2382-2383.

Joe Klieg. "Convert a list of data frames into one data frame". 27 February 2018. Accessed 25 September 2019. https://stackoverflow.com/a/49017065

Examples

## Not run: 
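  # Locate the example BED files shipped with the package
  pathToPeaks <- system.file("extdata",
     "H3K27me3Peaks.bed", package = "PeakMapper")
  pathToGenes <- system.file("extdata",
     "WS263Genes.bed", package = "PeakMapper")
  # Import the peaks and the features (genes) as dataframes
  H3K27me3Peaks <- importBED(pathToPeaks)
  WS263Genes <- importBED(pathToGenes)
  # Map each peak to its nearest gene(s)
  mappingResults <- mapPeaks(H3K27me3Peaks, WS263Genes)
  # Inspect the position and distance of each peak relative to its feature
  mappingResults$Peak_Position
  mappingResults$Peak_Distance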

## End(Not run)

fuscada2/PeakMapper documentation built on Dec. 8, 2019, 12:35 p.m.