findSurrogates-MultiBatch-method: Estimate batch from any sample-level surrogate variables that...


Description

In high-throughput assays, low-level summaries of copy number at copy number polymorphic loci (e.g., the mean log R ratio for each sample, or a principal-component derived summary) often differ between groups of samples due to technical sources of variation such as reagents, technician, or laboratory. Technical (as opposed to biological) differences between groups of samples are referred to as batch effects. A useful surrogate for batch is the chemistry plate on which the samples were hybridized.

In large studies, a Bayesian hierarchical mixture model with plate-specific means and variances is computationally prohibitive. However, chemistry plates processed at similar times may be qualitatively similar in terms of the distribution of the copy number summary statistic. Further, we have observed that some copy number polymorphic loci exhibit very little evidence of a batch effect, while other loci are more prone to technical variation.

We suggest combining plates whose distributions are judged similar by the Kolmogorov-Smirnov two-sample test, and applying this test independently to each candidate copy number polymorphism (CNP) identified in a study. The collapseBatch function is a wrapper to ks.test, implemented in the stats package, that compares all pairwise combinations of plates. The test is performed recursively on the batch variables defined for a given CNP until no batches can be combined. For smaller values of THR, plates are more likely to be judged as similar and combined.

Usage

## S4 method for signature 'MultiBatch'
findSurrogates(object, THR = 0.1)

Arguments

object

a MultiBatch instance

THR

scalar p-value cutoff for the K-S test. Two batches whose p-value exceeds THR are combined to define a single new batch.
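The rule that THR encodes for a single pair of batches can be sketched in a few lines of base R. This is only an illustration: similar_batches and the plate vectors are hypothetical names and made-up data, not part of CNPBayes.

```r
# Hypothetical helper (not a CNPBayes function): TRUE when the two-sample
# K-S p-value exceeds THR, i.e. when the two batches would be combined.
similar_batches <- function(x, y, THR = 0.1) {
  ks.test(x, y)$p.value > THR
}

# Made-up copy number summaries for three plates.
plate1 <- seq(0, 1, length.out = 50)
plate2 <- plate1 + 0.001  # nearly the same distribution as plate1
plate3 <- plate1 + 5      # clearly shifted relative to plate1

similar_batches(plate1, plate2)  # TRUE: p-value above THR, combine
similar_batches(plate1, plate3)  # FALSE: p-value below THR, keep separate
```

Because batches are combined when the p-value is greater than THR, lowering THR makes more pairs qualify for merging.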

Details

All pairwise comparisons of batches are performed. The two most similar batches are combined if the p-value exceeds THR. The process is repeated recursively until no two batches can be combined.
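The procedure described above can be sketched as a greedy loop in base R. This is a minimal sketch under stated assumptions, not the CNPBayes implementation: merge_similar_batches is a hypothetical helper, "most similar" is taken to mean the pair with the largest K-S p-value, and the merged batch label is simply the two old labels pasted together.

```r
# Hypothetical helper (not the CNPBayes implementation).
# x: per-sample summary statistic; batch: per-sample batch label.
merge_similar_batches <- function(x, batch, THR = 0.1) {
  batch <- as.character(batch)
  repeat {
    ids <- unique(batch)
    if (length(ids) < 2) break
    # All pairwise combinations of the current batches (one pair per column).
    pairs <- combn(ids, 2)
    pvals <- apply(pairs, 2, function(p) {
      # ks.test can warn when the data contain ties; silenced in this sketch.
      suppressWarnings(ks.test(x[batch == p[1]], x[batch == p[2]])$p.value)
    })
    # Assumption: the most similar pair is the one with the largest p-value.
    best <- which.max(pvals)
    if (pvals[best] <= THR) break  # no remaining pair can be combined
    # Relabel both members of the pair as a single new batch.
    merged <- paste(pairs[, best], collapse = ",")
    batch[batch %in% pairs[, best]] <- merged
  }
  batch
}

# Made-up data: plates A and B are nearly identical, plate C is shifted.
base <- seq(0, 1, length.out = 50)
x <- c(base, base + 0.001, base + 5)
plate <- rep(c("A", "B", "C"), each = 50)
unique(merge_similar_batches(x, plate, THR = 0.1))  # "A,B" "C"
```

Each pass re-tests every pair because a merged batch has a new empirical distribution, so earlier p-values no longer apply.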

Value

MultiBatch object


CNPBayes documentation built on May 6, 2019, 4:06 a.m.