binReads: Convert aligned reads from various file formats into read...
In AneuFinder: Analysis of Copy Number Variation in Single-Cell-Sequencing Data

Description Usage Arguments Details Value See Also Examples

Convert aligned reads in .bam or .bed(.gz) format into read counts in equidistant windows.

binReads(file, assembly, ID = basename(file), bamindex = file,
  chromosomes = NULL, pairedEndReads = FALSE, min.mapq = 10,
  remove.duplicate.reads = TRUE, max.fragment.width = 1000,
  blacklist = NULL, outputfolder.binned = "binned_data", binsizes = 1e+06,
  stepsizes = NULL, reads.per.bin = NULL, reads.per.step = NULL,
  bins = NULL, variable.width.reference = NULL, save.as.RData = FALSE,
  calc.complexity = TRUE, call = match.call(), reads.store = FALSE,
  outputfolder.reads = "data", reads.return = FALSE,
  reads.overwrite = FALSE, reads.only = FALSE, use.bamsignals = FALSE)

`file`	A file with aligned reads. Alternatively a `GRanges-class` with aligned reads.
`assembly`	Please see `fetchExtendedChromInfoFromUCSC` for available assemblies. Only necessary when importing BED files. BAM files are handled automatically. Alternatively a data.frame with columns 'chromosome' and 'length'.
`ID`	An identifier that will be used to identify the file throughout the workflow and in plotting.
`bamindex`	BAM index file. Can be specified without the .bai ending. If the index file does not exist it will be created and a warning is issued.
`chromosomes`	If only a subset of the chromosomes should be binned, specify them here.
`pairedEndReads`	Set to `TRUE` if you have paired-end reads in your BAM files (not implemented for BED files).
`min.mapq`	Minimum mapping quality when importing from BAM files. Set `min.mapq=NA` to keep all reads.
`remove.duplicate.reads`	A logical indicating whether or not duplicate reads should be removed.
`max.fragment.width`	Maximum allowed fragment length. This is to filter out erroneously wrong fragments due to mapping errors of paired end reads.
`blacklist`	A `GRanges-class` or a bed(.gz) file with blacklisted regions. Reads falling into those regions will be discarded.
`outputfolder.binned`	Folder to which the binned data will be saved. If the specified folder does not exist, it will be created.
`binsizes`	An integer vector with bin sizes. If more than one value is given, output files will be produced for each bin size.
`stepsizes`	A vector of step sizes the same length as `binsizes`. Only used for `method="HMM"`.
`reads.per.bin`	Approximate number of desired reads per bin. The bin size will be selected accordingly. Output files are produced for each value.
`reads.per.step`	Approximate number of desired reads per step.
`bins`	A named `list` with `GRanges-class` containing precalculated bins produced by `fixedWidthBins` or `variableWidthBins`. Names must correspond to the binsize.
`variable.width.reference`	A BAM file that is used as reference to produce variable width bins. See `variableWidthBins` for details.
`save.as.RData`	If set to `FALSE`, no output file will be written. Instead, a GenomicRanges object containing the binned data will be returned. Only the first binsize will be processed in this case.
`calc.complexity`	A logical indicating whether or not to estimate library complexity.
`call`	The `match.call()` of the parent function.
`reads.store`	If `TRUE` processed read fragments will be saved to file. Reads are processed according to `min.mapq` and `remove.duplicate.reads`. Paired end reads are coerced to single end fragments. Will be ignored if `use.bamsignals=TRUE`.
`outputfolder.reads`	Folder to which the read fragments will be saved. If the specified folder does not exist, it will be created.
`reads.return`	If `TRUE` no binning is done and instead, read fragments from the input file are returned in `GRanges-class` format.
`reads.overwrite`	Whether or not an existing file with read fragments should be overwritten.
`reads.only`	If `TRUE` only read fragments are stored and/or returned and no binning is done.
`use.bamsignals`	If `TRUE` the bamsignals package will be used for binning. This gives a tremendous performance increase for the binning step. `reads.store` and `calc.complexity` will be set to `FALSE` in this case.

Convert aligned reads from .bam or .bed(.gz) files into read counts in equidistant windows (bins). This function uses GenomicRanges::countOverlaps to calculate the read counts.

The function produces a list() of GRanges-class or GRangesList objects with meta data columns 'counts', 'mcounts', 'pcounts' that contain the total, minus and plus read count. This binned data will be either written to file (save.as.RData=FALSE) or given as return value (save.as.RData=FALSE).

binning

## Get an example BED file with single-cell-sequencing reads
bedfile <- system.file("extdata", "KK150311_VI_07.bam.bed.gz", package="AneuFinderData")
## Bin the BED file into bin size 1Mb
binned <- binReads(bedfile, assembly='mm10', binsize=1e6,
                  chromosomes=c(1:19,'X','Y'))
print(binned)