estimateBACsizeFromVDV: Estimate the size of a BAC using the length of all VDV reads

View source: R/estimateBACsizeFromVDV.R

Description

Cluster the sizes of VDV reads for several numbers of clusters (use LongestDNA rather than ReadLength; see Details). For each clustering, compute the median size of each cluster and keep the largest one. The estimated BAC size is the median of these largest medians.
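
The procedure described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the NanoBAC source: the helper name sketch_bac_size is hypothetical, it uses cluster::pam to stand in for the "pam" option only, and it returns a bare number rather than a tibble.

```r
# Sketch only: a stand-in for the procedure described above, NOT the NanoBAC code.
# Assumes the 'cluster' package (recommended package shipped with R) is installed.
library(cluster)

# Hypothetical helper: median of the largest per-cluster medians over several k
sketch_bac_size <- function(vdv_length, nclust = 2:10) {
  x <- matrix(vdv_length, ncol = 1)  # one observation per row, as pam() expects
  largest_medians <- vapply(nclust, function(k) {
    # Partition the read lengths into k clusters around medoids
    cl <- pam(x, k = k)$clustering
    # Median length within each cluster
    cluster_medians <- tapply(vdv_length, cl, median)
    # Keep the largest of these medians
    max(cluster_medians)
  }, numeric(1))
  # The estimate is the median of the largest medians, one per value of k
  median(largest_medians)
}

# Same simulated insert sizes as in the Examples section
set.seed(12345)
insert_lengths <- c(sample(102e3:106e3, 30),
                    sample(43e3:45e3, 10),
                    sample(23e3:25e3, 10))
estimate <- sketch_bac_size(insert_lengths)
print(estimate)
```

Taking the median over several values of k makes the estimate less sensitive to any single choice of cluster number; with the simulated data above it should land within the range of the longest inserts.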

Usage

estimateBACsizeFromVDV(vdvLength, nclust = 2:10, method = c("jenks", "pam"))

Arguments

vdvLength

integer vector of lengths of VDV reads

nclust

integer vector of numbers of clusters to use (defaults to 2:10)

method

character string indicating which clustering method to use ("jenks" or "pam"). The default is "jenks", which is faster.

Details

To use the result of this function with the FilterBACreads function, supply the "LongestDNA" values rather than the ReadLength values of the VDV reads.

Value

A tibble. The example below reads the estimated size from its BACsize column.

Author(s)

Pascal GP Martin

Examples

## Generate some random insert sizes
set.seed(12345)
InsertLengths <- c(sample(102e3:106e3, 30),
                   sample(43e3:45e3, 10),
                   sample(23e3:25e3, 10))
## Estimate the BAC size (smaller lengths are considered likely recombinants)
estimateBACsizeFromVDV(InsertLengths, method = "jenks")$BACsize

pgpmartin/NanoBAC documentation built on Dec. 11, 2020, 9:51 a.m.