makeBindingSites: Define equally sized binding sites from peak calling results...

View source: R/bindingsites.R

makeBindingSites    R Documentation

Define equally sized binding sites from peak calling results and iCLIP crosslink events.

Description

This function merges single-nucleotide crosslink sites into binding sites of a user-defined width (bsSize). Crosslink sites that are closer to each other than bsSize - 1 nucleotides are first concatenated into regions. Concatenated regions smaller than minWidth are removed before the merge and extension routine, which prevents outlier crosslink pileups, e.g. mapping artifacts, from being integrated into the final binding sites. The remaining regions are then iteratively merged and extended: regions larger than the desired output width are iteratively split up, always setting the position with the highest number of crosslinks as the center, and regions smaller than the desired width are symmetrically extended. The resulting binding sites are finally filtered by the defined constraints.
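The iterative splitting step can be illustrated with a minimal base R sketch. This is not the package's implementation: place_sites, the toy counts vector, and the rule of masking bsSize - 1 positions on either side of a chosen center (so that sites never overlap) are all assumptions made for illustration.

```r
# Toy sketch (base R only, not the BindingSiteFinder code): iteratively place
# fixed-width sites on one region. `counts` holds hypothetical crosslink
# counts per position; `place_sites` is a hypothetical helper.
place_sites <- function(counts, bsSize) {
  stopifnot(bsSize %% 2 == 1)
  flank <- (bsSize - 1) / 2
  remaining <- counts
  centers <- integer(0)
  while (any(remaining > 0)) {
    # first position with the highest remaining count becomes the next center
    center <- which.max(remaining)
    centers <- c(centers, center)
    # assumption: mask bsSize - 1 positions on each side so sites cannot overlap
    lo <- max(1, center - 2 * flank)
    hi <- min(length(remaining), center + 2 * flank)
    remaining[lo:hi] <- 0
  }
  centers <- sort(centers)
  # symmetric extension around each center gives sites of width bsSize
  data.frame(start = centers - flank, center = centers, end = centers + flank)
}

counts <- c(0, 1, 3, 1, 0, 0, 2, 5, 2, 1, 0, 4, 1)
sites <- place_sites(counts, bsSize = 3)
print(sites)
```

In this toy run the centers are chosen in order of decreasing signal (positions 8, 12, then 3), and every resulting site has width 3.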

Usage

makeBindingSites(
  object,
  bsSize = NULL,
  minWidth = 2,
  minCrosslinks = 2,
  minClSites = 1,
  centerIsClSite = TRUE,
  centerIsSummit = TRUE,
  sub.chr = NA,
  quiet = FALSE
)

Arguments

object

a BSFDataSet object (see BSFDataSet)

bsSize

an odd integer value specifying the size of the output binding sites

minWidth

the minimum size of regions that are subjected to the iterative merging routine, after the initial region concatenation.

minCrosslinks

the minimal number of positions in a final binding site that must be covered by at least one crosslink event

minClSites

the minimal number of crosslink sites that have to overlap a final binding site

centerIsClSite

logical, whether the center of a final binding site must be covered by an initial crosslink site

centerIsSummit

logical, whether the center of a final binding site must exhibit the highest number of crosslink events

sub.chr

chromosome identifier (e.g. chr1, chr2) used for subsetting the BSFDataSet object. This option can be used for testing different parameter options

quiet

logical, whether to print info messages

Details

The bsSize argument defines the final output width of the merged binding sites. It has to be an odd number, to ensure that a binding site has a distinct center.
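Why an odd width yields a distinct center can be checked with a few lines of base R. The values below (a center at position 100, a width of 9) are hypothetical and chosen only for illustration.

```r
# Hypothetical values for illustration only.
bsSize <- 9
center <- 100

stopifnot(bsSize %% 2 == 1)
flank <- (bsSize - 1) / 2            # same number of positions on each side
site_start <- center - flank
site_end <- center + flank
site_width <- site_end - site_start + 1

# With an even width the flank is not a whole number, so no single
# position sits exactly in the middle.
even_flank <- (8 - 1) / 2

print(c(start = site_start, end = site_end, width = site_width))
print(even_flank)
```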

The minWidth parameter describes the minimum width a range must have after the initial concatenation step. For example, consider bsSize = 9 and minWidth = 3. All initial crosslink sites that are closer to each other than 8 nucleotides (bsSize - 1) will be concatenated. Any of the resulting ranges with a width of less than 3 nucleotides will be removed, which corresponds to about 1/3 of the desired binding site width.
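The bsSize = 9, minWidth = 3 example can be reproduced on toy data with base R. This is a sketch rather than the package's code: concat_sites is a hypothetical helper, the positions are made up, and "distance" is assumed to mean the difference between neighboring crosslink positions on a single chromosome and strand.

```r
# Toy sketch (base R only): concatenate nearby crosslink sites, then drop
# regions narrower than minWidth. `concat_sites` is a hypothetical helper.
concat_sites <- function(pos, bsSize, minWidth) {
  pos <- sort(unique(pos))
  # assumption: start a new region when the next site is bsSize - 1 or more away
  group <- cumsum(c(TRUE, diff(pos) >= bsSize - 1))
  regions <- data.frame(
    start = as.integer(tapply(pos, group, min)),
    end   = as.integer(tapply(pos, group, max))
  )
  regions$width <- regions$end - regions$start + 1L
  regions[regions$width >= minWidth, , drop = FALSE]
}

pos <- c(10, 12, 15, 40, 60, 61, 62)
regions <- concat_sites(pos, bsSize = 9, minWidth = 3)
print(regions)
```

Here the isolated site at position 40 forms a region of width 1 and is removed, while the regions 10 to 15 and 60 to 62 are kept for the merge and extension routine.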

The minCrosslinks argument defines how many positions of a binding site must be covered by at least one crosslink event. This threshold has to be chosen in conjunction with the binding site width. For example, a value of 3 with a binding site width of 9 means that 1/3 of all positions in the final binding site must be covered by a crosslink event. Setting this filter to 0 deactivates it.
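The coverage check behind minCrosslinks can be sketched as follows. The per-position counts and the helper passes_min_crosslinks are hypothetical and only illustrate the rule stated above, not the package's internal code.

```r
# Hypothetical crosslink counts for the 9 positions of one binding site.
site_counts <- c(0, 2, 0, 0, 5, 1, 0, 0, 0)

# Number of positions covered by at least one crosslink event.
covered <- sum(site_counts > 0)

# Hypothetical helper: does the site pass a given minCrosslinks threshold?
passes_min_crosslinks <- function(counts, minCrosslinks) {
  sum(counts > 0) >= minCrosslinks
}

print(covered)
print(passes_min_crosslinks(site_counts, minCrosslinks = 3))
print(passes_min_crosslinks(site_counts, minCrosslinks = 4))
# A threshold of 0 is always met, which is why it deactivates the filter.
print(passes_min_crosslinks(rep(0, 9), minCrosslinks = 0))
```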

The minClSites argument defines how many positions of the binding site must have been covered by the original crosslink site input. If the input was based on the single-nucleotide crosslink positions computed by PureCLIP, then this filter checks for the number of positions originally identified by PureCLIP in the computed binding sites. The default of minClSites = 1 essentially deactivates this filter. Setting this filter to 0 deactivates it.
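In contrast to minCrosslinks, this filter counts input sites rather than crosslink events. A minimal base R sketch, with made-up positions and the hypothetical variable names site_pos and input_cl_sites:

```r
# Hypothetical final binding site (width 9) and hypothetical input crosslink
# sites, e.g. single-nucleotide positions reported by a peak caller.
site_pos <- 96:104
input_cl_sites <- c(90, 98, 100, 101, 120)

# Number of binding site positions that were part of the original input.
n_cl_sites <- sum(site_pos %in% input_cl_sites)

minClSites <- 1
keep_site <- n_cl_sites >= minClSites

print(n_cl_sites)
print(keep_site)
```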

The options centerIsClSite and centerIsSummit ensure that the center of each binding site is covered by an initial crosslink site and represents the summit of crosslink events in the binding site, respectively.
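Both center conditions can be expressed as simple checks on one site. The sketch below uses made-up positions and counts, and it assumes that a center tied with another position for the highest count still qualifies as the summit; the package may treat ties differently.

```r
# Hypothetical binding site of width 9 with per-position crosslink counts.
site_pos <- 96:104
site_counts <- c(0, 1, 3, 2, 7, 2, 0, 1, 0)
input_cl_sites <- c(98, 100, 101)   # hypothetical initial crosslink sites

# Index of the central position; well defined because the width is odd.
center_idx <- (length(site_pos) + 1) / 2

# centerIsClSite: the center is covered by an initial crosslink site.
center_is_cl_site <- site_pos[center_idx] %in% input_cl_sites

# centerIsSummit: the center carries the highest count (ties assumed to pass).
center_is_summit <- site_counts[center_idx] == max(site_counts)

print(c(clSite = center_is_cl_site, summit = center_is_summit))
```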

The option sub.chr allows running the binding site merging on a smaller subset (e.g. "chr1") for improved computational speed when testing the effect of various binding site widths and filtering options.

Value

an object of type BSFDataSet with modified ranges

See Also

BSFDataSet, BSFind, mergeCrosslinkDiagnosticsPlot, makeBsSummaryPlot

Examples


# load the example data shipped with the package (provides the object 'bds')
files <- system.file("extdata", package = "BindingSiteFinder")
load(list.files(files, pattern = "\\.rda$", full.names = TRUE))

# standard options, no subsetting
bds <- makeBindingSites(object = bds, bsSize = 9, minWidth = 2,
                        minCrosslinks = 2, minClSites = 1)

# standard options, with subsetting to a single chromosome
bds <- makeBindingSites(object = bds, bsSize = 9, minWidth = 2,
                        minCrosslinks = 2, minClSites = 1, sub.chr = "chr22")


ZarnackGroup/BindingSiteFinder documentation built on May 2, 2024, 12:38 a.m.