segmentBins: Segments normalized copy number data

segmentBins    R Documentation

Segments normalized copy number data

Description

Segments normalized copy number data.

Usage

segmentBins(object, smoothBy=FALSE, alpha=1e-10, undo.splits="sdundo", undo.SD=1,
  force=FALSE, transformFun="log2", ...)

Arguments

object

An object of class QDNAseqCopyNumbers.

smoothBy

An optional integer. If given, the data are smoothed before segmentation by taking the mean of every smoothBy consecutive bins, and those means are then segmented. The default (FALSE) performs no smoothing. smoothBy=1L is a special case that performs no smoothing but splits the segmentation process by chromosome instead of by sample.
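The block averaging described for smoothBy can be illustrated in base R. This is only a sketch of the idea, not QDNAseq's internal code: the vector binValues and the grouping by position are assumptions made for illustration, and how the package treats missing values or an incomplete final block is not stated here.

```r
# Hypothetical per-bin copy number values for one sample (illustrative only)
binValues <- c(1.0, 1.2, 0.8, 1.0, 2.0, 2.2, 1.8, 2.0, 3.0)
smoothBy <- 4L

# Assign every run of smoothBy consecutive bins to the same block
block <- ceiling(seq_along(binValues) / smoothBy)

# One mean per block; in this sketch the last block holds a single bin
blockMeans <- tapply(binValues, block, mean)
print(blockMeans)
```

Segmenting the block means instead of the individual bins reduces the number of data points by roughly a factor of smoothBy, at the cost of coarser breakpoint positions.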

alpha

Significance level for the test to accept change-points. Default is 1e-10.

undo.splits

A character string specifying how change-points are to be undone, if at all. Default is "sdundo", which undoes splits whose segment means are not at least undo.SD standard deviations (SDs) apart. Other choices are "prune", which uses a sum of squares criterion, and "none", which keeps all change-points.

undo.SD

The number of SDs between means to keep a split if undo.splits="sdundo". Default is 1.0.
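The criterion behind undo.splits="sdundo" can be sketched in base R. The helper below, undoWeakSplits, is hypothetical and is not the DNAcopy implementation: it assumes the noise SD is estimated from the differences between neighbouring values, and it repeatedly removes the change-point with the smallest gap between adjacent segment means while that gap is below undoSD times the noise SD.

```r
# Hypothetical helper, for illustration only (not DNAcopy's code).
# breaks holds the index of the last bin of each segment except the final one.
undoWeakSplits <- function(values, breaks, undoSD = 1) {
  # Assumed noise estimate: robust SD of first differences, rescaled
  noiseSD <- mad(diff(values)) / sqrt(2)
  repeat {
    if (length(breaks) == 0L) break
    bounds <- c(0L, breaks, length(values))
    segMeans <- vapply(
      seq_len(length(bounds) - 1L),
      function(i) mean(values[(bounds[i] + 1L):bounds[i + 1L]]),
      numeric(1)
    )
    gaps <- abs(diff(segMeans))
    weakest <- which.min(gaps)
    # Stop once every remaining split is at least undoSD noise SDs wide
    if (gaps[weakest] >= undoSD * noiseSD) break
    breaks <- breaks[-weakest]
  }
  breaks
}

# Simulated data: a negligible shift after bin 100, a large one after bin 200
set.seed(1)
values <- c(rnorm(100, 0, 0.5), rnorm(100, 0.05, 0.5), rnorm(100, 5, 0.5))
keptBreaks <- undoWeakSplits(values, breaks = c(100L, 200L), undoSD = 1)
print(keptBreaks)
```

In this toy setting the split between the two nearly identical segments is expected to be undone, while the large shift is kept. Raising undoSD makes the criterion stricter, so fewer change-points survive.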

force

Whether to force execution when doing so removes downstream calling results. Default is FALSE.

transformFun

A function to transform the data with. This can be the default "log2" for log2(x + .Machine$double.xmin), "sqrt" for the Anscombe transform of sqrt(x * 3/8) which stabilizes the variance, "none" for no transformation, or any R function that performs the desired transformation and also its inverse when called with parameter inv=TRUE.
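A user-supplied transformFun has to perform both the forward transformation and, when called with inv=TRUE, its inverse. The function below is a minimal sketch of that contract: the name log2Shifted and the offset of 1 are assumptions chosen for illustration and are not part of QDNAseq.

```r
# Hypothetical custom transformation: log2 with an offset of 1.
# inv = FALSE applies the transform, inv = TRUE undoes it.
log2Shifted <- function(x, inv = FALSE, offset = 1) {
  if (!inv) {
    log2(pmax(x, 0) + offset)  # clamp negatives to 0 before taking the log
  } else {
    2^x - offset
  }
}

# Round trip on non-negative values should recover the input
original <- c(0, 0.5, 1, 2, 4)
transformed <- log2Shifted(original)
restored <- log2Shifted(transformed, inv = TRUE)
print(all.equal(restored, original))
```

Such a function would then be passed as transformFun=log2Shifted. Checking the round trip before use is a cheap way to catch an inverse that does not match the forward transformation.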

...

Additional arguments passed to segment of the DNAcopy package.

Value

Returns an object of class QDNAseqCopyNumbers with segmentation results added.

Numerical reproducibility

This method makes use of random number generation (RNG) via the segment function that is used internally. Because of this, calling the method multiple times with the same input data will give slightly different results each time. To get numerically reproducible results, the random seed must be fixed, e.g. by calling set.seed() at the top of the script.
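The effect of fixing the seed can be shown with base R alone. The sketch below uses runif as a stand-in for any RNG-dependent computation, and the seed value is arbitrary; in a QDNAseq script the same pattern would be a set.seed() call placed before segmentBins().

```r
# Arbitrary seed value, chosen for illustration
seed <- 20230202

# Two runs of an RNG-dependent computation, each started from the same seed
set.seed(seed)
firstDraw <- runif(3)
set.seed(seed)
secondDraw <- runif(3)

# The draws match because the RNG state was reset before each run
print(identical(firstDraw, secondDraw))
```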

Parallel processing

This function uses the future package to segment samples in parallel.
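With the future package, the parallel backend is chosen by the caller through future::plan(). The following is a minimal sketch that assumes future is installed; the worker count of 2 is arbitrary, and the point where segmentBins() would be called is only marked by a comment.

```r
# Arbitrary number of background R sessions, for illustration
workerCount <- 2L

if (requireNamespace("future", quietly = TRUE)) {
  # Switch to background R sessions and remember the previous plan
  previousPlan <- future::plan(future::multisession, workers = workerCount)

  # ... call segmentBins() here; samples can then be processed in parallel ...

  # Restore whatever plan was active before
  future::plan(previousPlan)
}
```

Without an explicit plan, future evaluates sequentially by default, so parallel execution is opt-in.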

Author(s)

Ilari Scheinin

References

[1] A.B. Olshen, E.S. Venkatraman (aka Venkatraman E. Seshan), R. Lucito and M. Wigler, Circular binary segmentation for the analysis of array-based DNA copy number data, Biostatistics, 2004
[2] E.S. Venkatraman and A.B. Olshen, A faster circular binary segmentation algorithm for the analysis of array CGH data, Bioinformatics, 2007

See Also

Internally, the data are segmented with segment from the DNAcopy package, which implements the circular binary segmentation (CBS) method [1,2].

Examples

 library(QDNAseq)

 # Load the example read counts
 data(LGG150)
 readCounts <- LGG150
 # Filter bins, then estimate and apply the correction
 readCountsFiltered <- applyFilters(readCounts)
 readCountsFiltered <- estimateCorrection(readCountsFiltered)
 copyNumbers <- correctBins(readCountsFiltered)
 # Normalize and smooth outlier bins before segmenting
 copyNumbersNormalized <- normalizeBins(copyNumbers)
 copyNumbersSmooth <- smoothOutlierBins(copyNumbersNormalized)
 copyNumbersSegmented <- segmentBins(copyNumbersSmooth)
 

ccagc/QDNAseq documentation built on Feb. 2, 2023, 12:56 p.m.