binReads: Convert aligned reads from various file formats into read...



Description

Convert aligned reads in .bam or .bed(.gz) format into read counts in equidistant windows.

Usage

binReads(
  file,
  experiment.table = NULL,
  ID = NULL,
  assembly,
  bamindex = file,
  chromosomes = NULL,
  pairedEndReads = FALSE,
  min.mapq = 10,
  remove.duplicate.reads = TRUE,
  max.fragment.width = 1000,
  blacklist = NULL,
  binsizes = 1000,
  stepsizes = binsizes/2,
  reads.per.bin = NULL,
  bins = NULL,
  variable.width.reference = NULL,
  use.bamsignals = TRUE,
  format = NULL
)

Arguments

file

A file with aligned reads. Alternatively, a GRanges-class object with aligned reads.

experiment.table

An experiment.table containing the supplied file. This is necessary to uniquely identify the file in later steps of the workflow. Set to NULL if you don't have it (not recommended).

ID

Optional ID to select a row from the experiment.table. Only necessary if the experiment table contains the same file in multiple positions in column 'file'.

assembly

The assembly to use; see getChromInfoFromUCSC for available assemblies. Only necessary when importing BED files; BAM files are handled automatically. Alternatively, a data.frame with columns 'chromosome' and 'length'.
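As a minimal sketch of the data.frame alternative, the following builds a table with the two required columns. The chromosome names and lengths are made-up toy values, not taken from any real genome build, and the object name toy_assembly is hypothetical.

```r
# Hypothetical toy assembly: names and lengths are invented for illustration.
toy_assembly <- data.frame(
  chromosome = c("chr1", "chr2"),
  length     = c(5000L, 3000L),
  stringsAsFactors = FALSE
)

# The two column names are the ones the argument description asks for.
str(toy_assembly)
```

Such an object would then be passed as assembly=toy_assembly when importing a BED file.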

bamindex

BAM index file. Can be specified without the .bai ending. If the index file does not exist, it is created and a warning is issued.

chromosomes

If only a subset of the chromosomes should be binned, specify them here.

pairedEndReads

Set to TRUE if you have paired-end reads in your BAM files (not implemented for BED files).

min.mapq

Minimum mapping quality when importing from BAM files. Set min.mapq=0 to keep all reads.

remove.duplicate.reads

A logical indicating whether or not duplicate reads should be removed.

max.fragment.width

Maximum allowed fragment length. This filters out erroneous fragments caused by mapping errors of paired-end reads.

blacklist

A GRanges-class or a bed(.gz) file with blacklisted regions. Reads falling into those regions will be discarded.

binsizes

An integer vector specifying the bin sizes to use.

stepsizes

An integer vector specifying the step size. One number can be given for each element in binsizes, reads.per.bin and bins (in that order).
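To picture how a step size interacts with a bin size, the base R sketch below counts toy read positions in equidistant bins of width 1000, once starting at offset 0 and once shifted by 500 (matching the default stepsizes = binsizes/2). This is an illustration only, not chromstaR's implementation: the read positions, the helper count_at_offset and the choice to drop incomplete bins at the chromosome ends are all assumptions.

```r
# Toy 1-based read positions on a single chromosome (made-up values).
read_pos  <- c(10, 480, 520, 990, 1010, 1490, 2600)
chrom_len <- 3000
binsize   <- 1000
stepsize  <- 500

# One set of equidistant bins per offset: 0, stepsize, ... (all < binsize).
offsets <- seq(0, binsize - 1, by = stepsize)

# Hypothetical helper: count reads in complete bins shifted by `offset`.
count_at_offset <- function(offset) {
  # 0-based bin boundaries; bin i covers (breaks[i], breaks[i + 1]] in 1-based positions.
  breaks <- seq(offset, chrom_len, by = binsize)
  n_bins <- length(breaks) - 1
  # Bin index for every read; 0 means "left of the first boundary".
  idx <- findInterval(read_pos - 1, breaks)
  # Keep only reads that fall into a complete bin.
  idx <- idx[idx >= 1 & idx <= n_bins]
  tabulate(idx, nbins = n_bins)
}

counts <- lapply(offsets, count_at_offset)
names(counts) <- offsets
print(counts)
```

With these toy values, offset 0 yields three bins and offset 500 yields two, because the partial bins at both ends of the shifted grid are discarded in this sketch.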

reads.per.bin

Approximate number of desired reads per bin. The bin size will be selected accordingly.
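One plausible reading of "selected accordingly" is that the bin size is derived from the average read density. The sketch below shows that arithmetic with invented numbers; the variable names and the exact rounding are assumptions, and the package may round or adjust the result differently.

```r
# Made-up inputs for illustration.
total_reads   <- 2e6   # number of aligned reads
genome_length <- 1e8   # total length of the binned chromosomes in bp
reads_per_bin <- 50    # desired approximate reads per bin

# Average number of reads per base pair.
reads_per_bp <- total_reads / genome_length

# Bin size (in bp) that would contain about `reads_per_bin` reads on average.
binsize <- round(reads_per_bin / reads_per_bp)
print(binsize)
```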

bins

A GRanges-class object or a named list() of GRanges-class objects containing precalculated bins produced by fixedWidthBins or variableWidthBins. Names of the list must correspond to the binsize. If the list is unnamed, an attempt is made to automatically determine the binsize.

variable.width.reference

A BAM file that is used as reference to produce variable width bins. See variableWidthBins for details.

use.bamsignals

If TRUE, the bamsignals package is used to parse BAM files. This is much faster when only one binsize is calculated, but the runtime increases linearly with the number of binsizes. With use.bamsignals=FALSE the runtime is binsize-dependent, which might be faster if many binsizes are calculated.

format

One of c('bed','bam','GRanges',NULL). With NULL, the format is determined automatically from the file ending.
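Determining the format from the file ending could look roughly like the sketch below. The helper guess_format is hypothetical and is not part of chromstaR; it only recognises the endings named on this page (.bam, .bed and .bed.gz).

```r
# Hypothetical helper, not a chromstaR function.
guess_format <- function(path) {
  path <- tolower(path)
  if (grepl("\\.bam$", path)) {
    "bam"
  } else if (grepl("\\.bed(\\.gz)?$", path)) {
    "bed"
  } else {
    stop("Cannot determine format from file ending: ", path)
  }
}

guess_format("sample.bam")
guess_format("sample.bed.gz")
```

Setting format explicitly avoids relying on the file ending, for example when a file carries a nonstandard extension.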

Details

Convert aligned reads from .bam or .bed(.gz) files into read counts in equidistant windows (bins). This function uses GenomicRanges::countOverlaps to calculate the read counts, or alternatively bamsignals::bamProfile if option use.bamsignals is set (only effective for .bam files).

Value

If only one bin size was specified for option binsizes, the function returns a single GRanges-class object with metadata column 'counts' that contains the read count. If multiple binsizes were specified, the function returns a named list() of GRanges-class objects.

Examples

library(chromstaR)
## Get an example BAM file with ChIP-seq reads
file <- system.file("extdata", "euratrans",
                    "lv-H3K27me3-BN-male-bio2-tech1.bam",
                    package="chromstaRData")
## Load the chromosome lengths and the experiment table shipped with the package
data(rn4_chrominfo)
data(experiment_table)
## Bin the file into bin size 1000bp
binned <- binReads(file, experiment.table=experiment_table,
                   assembly=rn4_chrominfo, binsizes=1000,
                   stepsizes=500, chromosomes='chr12')
print(binned)

ataudt/chromstaR documentation built on Dec. 26, 2021, 12:07 a.m.