tnK.norm: Weighted targeted normalization using K-means optimization.


View source: R/tnK.norm.R

Description

Targeted normalization of bin counts that uses K-means clustering to group bins with similar coverage profiles across the samples. For outlier bins, i.e. bins showing a unique profile, the set of most similar bins is computed for each bin individually. Optionally, the importance of each sample in the definition of the coverage profile can be weighted using principal components. This normalization is still in development. It is intended to be faster than the full targeted normalization ('tn.norm') and to better normalize outlier samples (e.g. if the data is not homogeneous).
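
The grouping step described above can be illustrated with a minimal base R sketch. This is not the PopSV implementation: the simulated counts, the choice of three clusters and all variable names are assumptions made for the example. It only shows how bins with similar coverage profiles across samples can be grouped with 'kmeans'.

```r
# Toy illustration only: NOT the PopSV implementation.
set.seed(1)

n.bins <- 60
n.samples <- 5

# Simulated bin counts: rows are bins, columns are reference samples.
# Three groups of bins with different average coverage (assumed values).
bin.means <- rep(c(50, 100, 200), each = n.bins / 3)
cov.mat <- matrix(rpois(n.bins * n.samples, lambda = rep(bin.means, n.samples)),
                  nrow = n.bins, ncol = n.samples)
colnames(cov.mat) <- paste0("s", seq_len(n.samples))

# Group bins with similar coverage profiles across the samples.
km <- kmeans(cov.mat, centers = 3, nstart = 10)

# Cluster label for each bin, and the number of bins per cluster.
bin.cluster <- km$cluster
cluster.sizes <- table(bin.cluster)
print(cluster.sizes)
```

In such a scheme, bins that share a cluster would be normalized together, which is presumably cheaper than computing a dedicated set of similar bins for every bin.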

Usage

tnK.norm(bc.ref, samp, bc.to.norm = NULL, cont.sample, pca.weights = FALSE,
  max.size = 1000, plot = FALSE)
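
As a rough sketch of how a call might be set up, the snippet below builds toy inputs. The layout of 'bc.ref' (columns chr, start, end followed by one column per reference sample) and the sample names 'sampA', 'sampB' and 'sampC' are assumptions, not taken from this page. The call itself is left commented out because it requires PopSV and has not been checked against such small data.

```r
# Hypothetical toy data; the column layout of bc.ref is an assumption.
set.seed(2)
bins <- data.frame(chr = "1",
                   start = seq(1, by = 1000, length.out = 10),
                   end = seq(1000, by = 1000, length.out = 10),
                   stringsAsFactors = FALSE)

# One column of simulated bin counts per reference sample.
bc.ref <- cbind(bins,
                sampA = rpois(10, 100),
                sampB = rpois(10, 100),
                sampC = rpois(10, 100))

## Not run:
# library(PopSV)
# # Normalize sample "sampB" using "sampA" as the control sample.
# bc.norm <- tnK.norm(bc.ref, samp = "sampB", cont.sample = "sampA")
# head(bc.norm)
## End(Not run)
```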

Arguments

bc.ref

a data.frame with the coverage in the reference samples.

samp

the name of the sample to normalize.

bc.to.norm

If non-NULL and 'samp' is not in 'bc.ref', this data.frame is used instead. It should be a single-sample coverage data.frame, i.e. with exactly the columns chr, start, end and bc. Default NULL.
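
The expected shape of 'bc.to.norm' can be checked before calling the function. The helper 'is_single_sample_bc' below is hypothetical and not part of PopSV. It assumes that "exactly" also fixes the column order.

```r
# Hypothetical helper, not part of PopSV: TRUE if df is a data.frame
# whose columns are exactly chr, start, end and bc, in that order.
is_single_sample_bc <- function(df) {
  is.data.frame(df) && identical(colnames(df), c("chr", "start", "end", "bc"))
}

# A single-sample coverage data.frame with the expected columns.
bc.to.norm <- data.frame(chr = c("1", "1"),
                         start = c(1, 1001),
                         end = c(1000, 2000),
                         bc = c(95, 110),
                         stringsAsFactors = FALSE)
ok <- is_single_sample_bc(bc.to.norm)

# Dropping a column makes the check fail.
bad <- bc.to.norm[, c("chr", "start", "bc")]
not.ok <- is_single_sample_bc(bad)

print(c(ok = ok, not.ok = not.ok))
```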

cont.sample

the name of the control sample to normalize the coverage to.

pca.weights

should the samples be weighted using principal components? Default FALSE.

max.size

the maximum size of a cluster of bins. Default 1000.

plot

should some graphs be output? Default FALSE.

Value

a data.frame with the normalized bin counts. Columns: chr, start, end, bc.

Author(s)

Jean Monlong


jmonlong/PopSV documentation built on Sept. 15, 2019, 9:29 p.m.