Perform tQN normalization of intensity data.
gty | a
thresholds | thresholds for scaling of x- and y-intensities; defaults recommended in Staaf et al. (2008)
clusters | a pre-computed matrix of cluster means
prenorm | logical; if
xynorm | logical; if
adjust.lrr | logical; if
... | ignored
Implements thresholded quantile normalization (tQN) as described in Staaf et al. (2008). Quantile
normalization as originally described in Bolstad et al. (2003) matches quantiles across multiple samples
so that all samples' intensities have the same empirical distribution. tQN instead matches the quantiles
of the x- and y-intensities within each sample in order to reduce noise in the B-allele frequency (BAF) calculation
proposed by Peiffer et al. (2006). The quantile-normalized intensities are then thresholded to limit
the ratio between the transformed and raw values.
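The within-sample matching and thresholding described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name tqn_sketch is hypothetical, the threshold values are placeholders rather than confirmed defaults, and the sketch assumes non-negative intensities and that the threshold caps the normalized/raw ratio from above.

```r
# Hypothetical helper (not part of the package): tQN-style normalization
# of the x- and y-intensities of a single sample.
tqn_sketch <- function(x, y, thresholds = c(1.5, 1.2)) {
  stopifnot(length(x) == length(y))
  # Quantile-normalize x against y: the value at each rank is replaced by
  # the mean of the two sorted vectors at that rank.
  target <- (sort(x) + sort(y)) / 2
  xq <- target[rank(x, ties.method = "first")]
  yq <- target[rank(y, ties.method = "first")]
  # Threshold: the normalized value may not exceed threshold * raw value.
  xq <- pmin(xq, thresholds[1] * x)
  yq <- pmin(yq, thresholds[2] * y)
  list(x = xq, y = yq)
}

# Toy data: y is systematically brighter than x.
res <- tqn_sketch(x = c(1, 2, 3), y = c(10, 20, 30))
```

In this toy case the x-intensities are pulled up towards the shared distribution but capped by the threshold, while the y-intensities are pulled down without restriction.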
NB: the quality of the result of tQN depends strongly on the reference clusters provided in clusters, so beware.
The object clusters should be a dataframe with one row per marker and at least the following six columns:
A.R, A.T, the values of R and theta, respectively, for the centroid of the AA homozygous cluster;
B.R, B.T, likewise for the BB homozygous cluster; and H.R, H.T, likewise for the AB heterozygous cluster.
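As an illustration of the expected layout, a clusters dataframe for two markers might be built and checked as below. The marker names and centroid values are invented for the example, and the column check is only a sketch of the kind of validation a caller might do before normalizing.

```r
# Invented centroids for two markers; only the column names come from
# the description above.
clusters <- data.frame(
  A.R = c(1.10, 0.95), A.T = c(0.03, 0.05),  # AA homozygous cluster
  B.R = c(1.05, 1.00), B.T = c(0.97, 0.94),  # BB homozygous cluster
  H.R = c(1.20, 1.10), H.T = c(0.50, 0.48),  # AB heterozygous cluster
  row.names = c("marker1", "marker2")
)

# Check that all six required columns are present.
required <- c("A.R", "A.T", "B.R", "B.T", "H.R", "H.T")
missing_cols <- setdiff(required, colnames(clusters))
if (length(missing_cols) > 0) {
  stop("clusters is missing columns: ", paste(missing_cols, collapse = ", "))
}
```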
The transformations proposed by Peiffer et al. (2006) assume that most samples will fall into three well-defined
clusters at each marker, save for a relatively small proportion of aberrantly-hybridizing samples. Indeed the BAF
is only well-defined in this case. However, these assumptions are too restrictive for many arrays, and
in particular for arrays which include copy-number probes. It may be possible to obtain a tighter distribution
of LRR values by choosing adjust.lrr = TRUE and re-computing the LRR values against a reference distribution
independent of BAF or underlying clustering pattern. For this, an additional column Rmean is required in
clusters, which gives the "mean" (or other appropriately-chosen central value) of R across *all* possible
clusters at this marker.
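To make the two quantities concrete, the sketch below computes a BAF by piecewise-linear interpolation of theta between the three cluster centroids, in the spirit of Peiffer et al. (2006), and an LRR as the log2 ratio of R to Rmean. The helper name baf_sketch and all numeric values are hypothetical, the sketch assumes A.T < H.T < B.T at every marker, and the package's exact computation with adjust.lrr = TRUE may differ.

```r
# Hypothetical helper: BAF from theta by linear interpolation between the
# AA, AB and BB centroids, clamped to [0, 1].
baf_sketch <- function(theta, tAA, tAB, tBB) {
  baf <- ifelse(theta < tAB,
                0.5 * (theta - tAA) / (tAB - tAA),
                0.5 + 0.5 * (theta - tAB) / (tBB - tAB))
  pmin(pmax(baf, 0), 1)
}

# Invented centroids for two markers, including the extra Rmean column.
clusters <- data.frame(
  A.T = c(0.1, 0.1), H.T = c(0.5, 0.5), B.T = c(0.9, 0.9),
  Rmean = c(1.0, 2.0)
)

# One sample's theta and R at the same two markers.
theta <- c(0.25, 0.75)
R <- c(2.0, 1.0)

baf <- baf_sketch(theta, clusters$A.T, clusters$H.T, clusters$B.T)
# LRR against a single per-marker reference, independent of the cluster call.
lrr <- log2(R / clusters$Rmean)
```

Because Rmean does not depend on which cluster a sample falls into, the LRR computed this way stays defined at markers where the three-cluster assumption fails.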
A copy of the input object, with raw intensities replaced by the normalized ones. Two additional attributes,
baf and lrr, store the BAF (B-allele frequency) and LRR (log2 intensity ratio).
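The attributes can be read back with base R's attr(). The object below is only a stand-in built with structure() so that the example is self-contained; the shape of the real return value and of its baf and lrr attributes is an assumption here.

```r
# Stand-in for a normalized object: a small matrix carrying 'baf' and 'lrr'
# attributes. A real result would come from the normalization itself.
normed <- structure(
  matrix(c(0.9, 1.1, 1.0, 0.2), nrow = 2,
         dimnames = list(c("m1", "m2"), c("x", "y"))),
  baf = c(m1 = 0.5, m2 = 0.02),
  lrr = c(m1 = 0.1, m2 = -0.3)
)

# Retrieve the stored values by attribute name.
baf <- attr(normed, "baf")
lrr <- attr(normed, "lrr")
```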
Adapted from code provided by Johan Staaf to John Didion.
Staaf J et al. (2008) BMC Bioinformatics. doi:10.1186/1471-2105-9-409.
Bolstad BM et al. (2003) A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. Bioinformatics 19(2): 185-193.
Peiffer DA et al. (2006) Genome Res 16(9): 1136-1148. doi:10.1101/gr.5402306.
Didion JP et al. (2014) BMC Genomics. doi:10.1186/1471-2164-15-847.