collapseBins: Collapse consecutive bins

collapseBins R Documentation

Description

Collapses runs of consecutive bins that share a value in a chosen column (for example, the same combinatorial state) into single bins with adjusted genomic coordinates.

Usage

collapseBins(data, column2collapseBy = NULL, columns2sumUp = NULL,
  columns2average = NULL, columns2getMax = NULL, columns2drop = NULL)

Arguments

data

A data.frame containing the genomic coordinates in the first three columns.

column2collapseBy

The number of the column that will be used to collapse all other inputs. If a set of consecutive bins has the same value in this column, they will be aggregated into one bin with adjusted genomic coordinates. If NULL, directly adjacent bins will be collapsed.

columns2sumUp

Column numbers that will be summed during the aggregation process.

columns2average

Column numbers that will be averaged during the aggregation process.

columns2getMax

Column numbers where the maximum will be chosen during the aggregation process.

columns2drop

Column numbers that will be dropped after the aggregation process.
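
The roles of these arguments can be sketched in base R. The sketch below is not the package's implementation: it uses a small hypothetical data.frame, invents the output column names (counts.sum, counts.mean, counts.max), and assumes that "directly adjacent" means a bin lies on the same chromosome as the previous bin and starts exactly one position after its end.

```r
# Hypothetical bins: a gap follows the second bin, and the last bin is on chr2.
bins <- data.frame(
  seqnames = c("chr1", "chr1", "chr1", "chr2"),
  start    = c(0, 200, 600, 0),
  end      = c(199, 399, 799, 199),
  counts   = c(5, 7, 2, 4)
)

n <- nrow(bins)

# Assumption: a bin continues the previous one if it is on the same
# chromosome and starts right after the previous bin ends.
adjacent <- bins$seqnames[-1] == bins$seqnames[-n] &
            bins$start[-1]    == bins$end[-n] + 1

# Every non-adjacent bin opens a new run; cumsum() numbers the runs.
run_id <- cumsum(c(TRUE, !adjacent))

# One output row per run, with summed, averaged and maximum counts.
collapsed <- data.frame(
  seqnames    = bins$seqnames[!duplicated(run_id)],
  start       = as.vector(tapply(bins$start,  run_id, min)),
  end         = as.vector(tapply(bins$end,    run_id, max)),
  counts.sum  = as.vector(tapply(bins$counts, run_id, sum)),
  counts.mean = as.vector(tapply(bins$counts, run_id, mean)),
  counts.max  = as.vector(tapply(bins$counts, run_id, max))
)
collapsed
```

Only the first two bins are merged here, because the third bin does not start right after the second and the fourth lies on a different chromosome.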

Details

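The following tables illustrate the principle of collapsing. In this example, moreColumns and columns2sumUp each span two columns, and columns that are not aggregated keep the value from the first bin of each collapsed run: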

Input data:

seqnames  start  end  column2collapseBy  moreColumns  columns2sumUp
chr1      0      199  2                  1   10       1   3
chr1      200    399  2                  2   11       0   3
chr1      400    599  2                  3   12       1   3
chr1      600    799  1                  4   13       0   3
chr1      800    999  1                  5   14       1   3

Output data:

seqnames  start  end  column2collapseBy  moreColumns  columns2sumUp
chr1      0      599  2                  1   10       2   9
chr1      600    999  1                  4   13       1   6
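
The same collapse can be reproduced by hand in base R, which may help when checking results. This is a minimal sketch of the principle, not the code used by collapseBins; the column names state, more1, more2, sum1 and sum2 are placeholders chosen for the toy data.

```r
# Toy input mirroring the input table above (column names are illustrative).
bins <- data.frame(
  seqnames = rep("chr1", 5),
  start    = c(0, 200, 400, 600, 800),
  end      = c(199, 399, 599, 799, 999),
  state    = c(2, 2, 2, 1, 1),   # plays the role of column2collapseBy
  more1    = 1:5,                # untouched columns
  more2    = 10:14,
  sum1     = c(1, 0, 1, 0, 1),   # columns to sum up
  sum2     = rep(3, 5)
)

n <- nrow(bins)

# A new run starts whenever the chromosome or the state changes.
new_run <- c(TRUE, bins$seqnames[-1] != bins$seqnames[-n] |
                   bins$state[-1]    != bins$state[-n])
run_id <- cumsum(new_run)

first <- !duplicated(run_id)                   # first bin of each run
last  <- !duplicated(run_id, fromLast = TRUE)  # last bin of each run

# Keep the first bin of each run, extend its end, and sum the chosen columns.
collapsed <- bins[first, ]
collapsed$end  <- bins$end[last]
collapsed$sum1 <- as.vector(tapply(bins$sum1, run_id, sum))
collapsed$sum2 <- as.vector(tapply(bins$sum2, run_id, sum))
rownames(collapsed) <- NULL
collapsed
```

The two resulting rows carry the same values as the output table: the end coordinate comes from the last bin of each run, the summed columns are totals over the run, and all other columns are taken from the first bin.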

Value

A data.frame.

Author(s)

Aaron Taudt

Examples

## Get an example BED file with single-cell-sequencing reads
bedfile <- system.file("extdata", "KK150311_VI_07.bam.bed.gz", package="AneuFinderData")
## Bin the BED file into bins of size 1 Mb
binned <- binReads(bedfile, assembly='mm10', binsize=1e6,
                  chromosomes=c(1:19,'X','Y'))
## Collapse the bins by chromosome and get average, summed and maximum read count
df <- as.data.frame(binned[[1]])
# Remove one bin for illustration purposes
df <- df[-3,]
head(df)
collapseBins(df, column2collapseBy='seqnames', columns2sumUp=c('width','counts'),
                       columns2average='counts', columns2getMax='counts',
                       columns2drop=c('mcounts','pcounts'))
collapseBins(df, column2collapseBy=NULL, columns2sumUp=c('width','counts'),
                       columns2average='counts', columns2getMax='counts',
                       columns2drop=c('mcounts','pcounts'))


ataudt/aneufinder documentation built on April 18, 2023, 4:20 a.m.