kolmogorov_batches: Group provisional batch labels by similarity of eCDFs

Description

This function groups provisional batches (e.g., chemistry plate, date, or DNA source) by the similarity of their empirical cumulative distribution functions (eCDFs). Similarity of two eCDFs is assessed by the p-value of the two-sample Kolmogorov-Smirnov test. All pairwise combinations of batches are compared and similar batches are combined, recursively, until no two batches can be combined.
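The merging idea can be sketched in base R as follows. This is a minimal illustration and not the CNPBayes implementation: the helper name `merge_similar_batches`, its arguments, and the rule of merging the most similar pair first are assumptions made for the example. Only `ks.test` and `combn` are existing R functions.

```r
# Hypothetical helper (not part of CNPBayes): repeatedly merge the pair of
# batches with the largest two-sample KS p-value until no pair exceeds the cutoff.
merge_similar_batches <- function(x, batch, ks_cutoff) {
  batch <- as.character(batch)
  repeat {
    labels <- unique(batch)
    if (length(labels) < 2) break
    # All pairs of current batch labels, one pair per column
    pairs <- utils::combn(labels, 2)
    # KS p-value for each pair
    pvals <- apply(pairs, 2, function(p) {
      stats::ks.test(x[batch == p[1]], x[batch == p[2]])$p.value
    })
    best <- which.max(pvals)
    # Stop once no two batches are similar enough to combine
    if (pvals[best] <= ks_cutoff) break
    # Assumption: the merged batch is labelled by joining the two old labels
    batch[batch %in% pairs[, best]] <- paste(pairs[, best], collapse = ",")
  }
  batch
}

# Toy data: batches A and B have nearly identical eCDFs, C is shifted far away
x <- c(seq(0, 1, length.out = 50),
       seq(0, 1, length.out = 50) + 0.001,
       seq(10, 11, length.out = 50))
batch <- rep(c("A", "B", "C"), each = 50)
merged <- merge_similar_batches(x, batch, ks_cutoff = 1e-6)
table(merged)
```

In this toy data, A and B end up under one label while C stays separate. A smaller cutoff combines more batches, because fewer pairs are judged different.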

Usage

kolmogorov_batches(dat, KS_cutoff)

Arguments

dat

typically a 'tibble' obtained from 'assays(MultiBatch)'

KS_cutoff

numeric scalar giving the cutoff for the Kolmogorov-Smirnov p-value. Two provisional batches whose p-value is above this cutoff are combined.

See Also

ks.test

Examples

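library(CNPBayes)
library(SummarizedExperiment)  # also attaches GenomicRanges and IRanges

## Load the example SummarizedExperiment shipped with CNPBayes
extdir <- system.file("extdata", package="CNPBayes")
se <- readRDS(file.path(extdir, "snp_se.rds"))
## Restrict to the SNPs overlapping a CNV region on chr2
cnv_region <- GRanges("chr2", IRanges(90010895, 90248037),
                      seqinfo=seqinfo(se))
se2 <- subsetByOverlaps(se, cnv_region)
## Use the chemistry plate as the provisional batch label
provisional_batch <- se2$Sample.Plate
full.data <- median_summary(se2,
                            provisional_batch=provisional_batch,
                            assay_index=2,
                            THR=-1)
## Not run: 
   ## Combine plates whose KS p-value is above 1e-6
   batched.data <- kolmogorov_batches(full.data, 1e-6)

## End(Not run)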

scristia/CNPBayes documentation built on Aug. 9, 2020, 7:31 p.m.