down_sample2: Downsamples the data to reduce computation


Description

Downsamples the observations within each batch.

Usage

down_sample2(dat, min_size = 300)

Arguments

dat

a 'tibble' containing the one-dimensional summaries for each sample and the batch labels

min_size

integer indicating the number of samples to randomly select for each batch. The actual number of samples included may be larger because samples flagged as likely deleted are not down-sampled.

Details

Downsampling is performed to reduce computation, as 100-300 observations are typically sufficient to approximate the multi-modal distribution produced by germline copy number deletions and duplications. Data points that have been marked as 'likely_deletion' are not down-sampled because they tend to be rare and are an important indication of the type of polymorphism.
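
The behaviour described above can be illustrated with a minimal sketch. This is not the package's implementation: the helper name sketch_down_sample and the column names id, oned, batch and likely_deletion are assumptions made for the example, and the sampling uses dplyr (version 1.0.0 or later for slice_sample).

```r
library(dplyr)

# Toy data. The column names are assumptions for illustration only.
set.seed(123)
dat <- tibble(
  id = seq_len(1000),
  oned = rnorm(1000),                  # one-dimensional summary per sample
  batch = rep(c(1L, 2L), each = 500),  # batch labels
  likely_deletion = id %% 100 == 0     # 10 flagged samples, 5 per batch
)

# Hypothetical helper: sample up to min_size unflagged rows per batch,
# then add back every row flagged as a likely deletion.
sketch_down_sample <- function(dat, min_size = 300) {
  flagged <- filter(dat, likely_deletion)
  sampled <- dat %>%
    filter(!likely_deletion) %>%
    group_by(batch) %>%
    slice_sample(n = min_size) %>%     # truncates silently for small batches
    ungroup()
  bind_rows(sampled, flagged) %>%
    arrange(batch, id)
}

out <- sketch_down_sample(dat, min_size = 300)
```

With this toy input, out has 610 rows: 300 randomly selected unflagged samples from each of the two batches plus all 10 flagged samples, which is why the result can exceed min_size per batch.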

Value

a down-sampled tibble

See Also

upsample2


scristia/CNPBayes documentation built on Aug. 9, 2020, 7:31 p.m.