estimateCorrection: Estimate correction to read counts for GC content and...


Description

Estimate correction to read counts for GC content and mappability.

Usage

estimateCorrection(object, span=0.65, family="symmetric", adjustIncompletes=TRUE,
  maxIter=1, cutoff=4, variables=c("gc", "mappability"), ...)

Arguments

object

A QDNAseqReadCounts object with counts data.

span

For loess, the parameter alpha which controls the degree of smoothing.

family

For loess, if "gaussian" fitting is by least-squares, and if "symmetric" a re-descending M estimator is used with Tukey's biweight function.

adjustIncompletes

A logical(1) specifying whether counts for bins with uncharacterized nucleotides (N's) in their reference genome sequence should be adjusted by dividing them by the percentage of characterized (A, C, G, T) nucleotides. Defaults to TRUE.

maxIter

An integer(1) specifying the maximum number of iterations to perform. Defaults to 1. If larger, bins with median residuals larger than cutoff are removed after the first loess fit, and the fitting is repeated until the list of bins to use stabilizes or maxIter iterations have been performed.

cutoff

A numeric(1) specifying the cutoff, as a number of standard deviations (as estimated with madDiff), above which bins are removed based on their median residuals. Not used if maxIter=1 (the default).

variables

A character vector specifying which variables to include in the correction. Can be c("gc", "mappability") (the default), "gc", or "mappability".

...

Additional arguments passed to loess.
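
How adjustIncompletes, maxIter and cutoff interact can be illustrated with a rough base R sketch. It is not the QDNAseq implementation: it uses simulated data for a single sample (so there is no median across samples), GC content as the only covariate, and stats::mad() as a stand-in for madDiff. All variable names (bins, counts, adjusted, use) are hypothetical.

```r
# Simulated bins for one sample; every name here is hypothetical.
set.seed(1)
n <- 300
bins <- data.frame(
  gc    = runif(n, 30, 60),                        # GC content per bin (%)
  bases = sample(c(100, 80), n, replace = TRUE,
                 prob = c(0.9, 0.1))               # % characterized bases
)
trueCounts <- 100 + 2 * (bins$gc - 45) + rnorm(n, sd = 5)
counts <- trueCounts * bins$bases / 100            # bins with N's get fewer reads
counts[1:5] <- counts[1:5] + 200                   # a few aberrant bins

# Analogue of adjustIncompletes=TRUE: divide by the characterized share.
adjusted <- counts / (bins$bases / 100)
d <- data.frame(gc = bins$gc, adjusted = adjusted)

# Analogue of maxIter > 1: refit after dropping bins with large residuals.
maxIter <- 5
cutoff <- 4
use <- rep(TRUE, n)
for (iter in seq_len(maxIter)) {
  fit <- loess(adjusted ~ gc, data = d[use, ], span = 0.65,
               family = "symmetric")
  # Residuals for all bins; predictions outside the fitted range are NA.
  res <- unname(d$adjusted - predict(fit, newdata = d))
  newUse <- !is.na(res) & abs(res) <= cutoff * mad(res, na.rm = TRUE)
  if (all(newUse == use)) break                    # bin list has stabilized
  use <- newUse
}
sum(use)  # number of bins kept for the final fit
```

With maxIter=1 the loop body runs once and no refit takes place, which is why cutoff has no effect in the default configuration.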

Value

Returns a QDNAseqReadCounts object with the assay data element fit added.

Parallel processing

This function uses future to calculate the QDNAseq model across samples in parallel.
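
A minimal sketch of enabling parallel processing, assuming the QDNAseq and future packages are installed. The choice of the multisession backend and of two workers is an assumption made for illustration, not a recommendation from this page.

```r
library(QDNAseq)
library(future)

data(LGG150)
readCountsFiltered <- applyFilters(LGG150)

# Assumption: run on two background R sessions on the local machine.
plan(multisession, workers = 2)
readCountsCorrected <- estimateCorrection(readCountsFiltered)

# Return to sequential processing afterwards.
plan(sequential)
```

With a single sample, as in LGG150, there is little to gain; the benefit should grow with the number of samples in the object.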

Author(s)

Ilari Scheinin

See Also

Internally, loess is used to fit the regression model.
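
To get a feel for what span and family do in loess itself, the following base R sketch fits simulated data containing a few outliers. The data and object names are made up for illustration and are unrelated to QDNAseq's internals.

```r
# Simulated curve with three gross outliers; names are hypothetical.
set.seed(42)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.1)
y[c(50, 100, 150)] <- y[c(50, 100, 150)] + 5

# Least-squares fit: outliers pull on the curve like any other point.
fitGaussian <- loess(y ~ x, span = 0.65, family = "gaussian")

# Robust fit: outliers are down-weighted with Tukey's biweight.
fitSymmetric <- loess(y ~ x, span = 0.65, family = "symmetric")

# A smaller span uses a narrower neighbourhood and gives a wigglier curve.
fitNarrow <- loess(y ~ x, span = 0.2, family = "symmetric")

# Compare the three fitted curves at the first outlier position.
c(gaussian  = predict(fitGaussian)[50],
  symmetric = predict(fitSymmetric)[50],
  narrow    = predict(fitNarrow)[50])
```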

Examples

data(LGG150)
readCounts <- LGG150
readCountsFiltered <- applyFilters(readCounts)
readCountsFiltered <- estimateCorrection(readCountsFiltered)

Example output

38,819	total bins
38,819	of which in selected chromosomes
36,722	of which with reference sequence
33,347	final bins
Calculating correction for GC content and mappability
    Calculating fit for sample LGG150 (1 of 1) ...
Done.

QDNAseq documentation built on Nov. 8, 2020, 6:57 p.m.