cnvSimCounts: Generate simulated molecule counts


View source: R/cnvSimCounts.R

Description

Generate simulated molecule counts

Usage

cnvSimCounts(
  totalMolecules = 10000000L,
  interval = cnvSimInterval(),
  subject = "simulatedSubject",
  variantWidth = 1L,
  CN = c(0, 0.5, 1, 1.5, 2),
  cnProb = c(0.00025, 0.00025, 0.999, 0.00025, 0.00025),
  seed = NULL
)

Arguments

totalMolecules

integer of length 1, the total number of molecules

interval

data.table interval object with 'captureProb' field, see details

subject

subject name/identifier

variantWidth

integer, gives the possible variant widths in contiguous intervals, see details

CN

numeric vector of possible copy numbers; 1.0 indicates diploid state, see details

cnProb

numeric vector of probabilities corresponding to the possible copy states in 'CN'

seed

integer, passed to set.seed()

Details

cnvSimCounts requires an interval object with an added field, 'captureProb', that defines the multinomial probability distribution for interval coverage. In this multinomial distribution, a success at an interval indicates that the interval was covered by a sequencing molecule.
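The multinomial draw described above can be sketched in base R. This is only an illustration, not the package's implementation: the vector captureProb below is a hypothetical stand-in for the 'captureProb' field of an interval object, and the molecule total is reduced to keep the example small.

```r
# Hypothetical capture probabilities for five intervals (they sum to 1);
# a stand-in for the 'captureProb' field of an interval object.
captureProb <- c(0.10, 0.25, 0.30, 0.20, 0.15)

set.seed(1234)  # fix the RNG so the draw is reproducible

# Distribute 10,000 molecules over the intervals in one multinomial draw.
# rmultinom() returns a matrix with one row per interval and one column
# per draw, so we keep the first (only) column.
molCount <- rmultinom(n = 1, size = 10000L, prob = captureProb)[, 1]

# Every molecule lands in exactly one interval, so the counts sum to 'size'
sum(molCount)
```

Intervals with a larger capture probability receive proportionally more molecules on average.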

cnvSimCounts simulates variable-width copy number variants, with the possible widths (number of contiguous intervals) given by the 'variantWidth' parameter. All variant widths are simulated with equal probability.
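As a rough illustration of drawing widths with equal probability, the base-R sketch below samples from a hypothetical set of widths. How the package actually places variants along the intervals is not described here, so this covers only the width draw.

```r
# Hypothetical set of allowed variant widths (in contiguous intervals)
variantWidth <- c(1L, 2L, 5L)

set.seed(7)

# Draw 1,000 widths, each allowed width being equally likely.
# Indexing through sample.int() avoids the sample() pitfall where a single
# number n is interpreted as the range 1:n.
idx <- sample.int(length(variantWidth), size = 1000, replace = TRUE)
widths <- variantWidth[idx]

# Tabulate how often each width was drawn
table(widths)
```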

The 'CN' parameter defines the possible copy states. To simplify the computations, the mcCNV package defines 1.0 as the diploid state. The "actual" number of copies is given by multiplying 'CN' by 2. As such, all entries in the 'CN' parameter must be multiples of 0.5.
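The relationship between 'CN' and the actual number of copies can be checked with a few lines of base R. The validity check shown is a hypothetical one written for this illustration and is not necessarily the check the package performs.

```r
# Default copy states from the Usage section
CN <- c(0, 0.5, 1, 1.5, 2)

# An entry is a non-negative multiple of 0.5 exactly when 2 * CN is a
# non-negative whole number (hypothetical check for illustration).
isValid <- CN >= 0 & (2 * CN) %% 1 == 0
stopifnot(all(isValid))

# Convert the relative copy states to actual copies: 0, 1, 2, 3, 4
actualCopies <- 2 * CN
actualCopies
```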

We adjust the capture probabilities by multiplying each probability by the simulated copy number. For example, when the copy number is 1 (the diploid state), the probability is left unchanged. However, if, say, 3 copies of the interval are present (copy number 1.5), the probability of capturing that interval is multiplied by 1.5.

We have found that, likely due to sequencing and mapping errors, even true homozygous deletions can have a few reads. We account for this by using a multiplier of 0.001 for intervals with complete deletions (copy number 0.0).
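The adjustment described above, including the 0.001 multiplier for complete deletions, might look like the following base-R sketch. The vectors captureProb and simCN are hypothetical inputs, and whether the package renormalizes the adjusted probabilities itself is an assumption; here the rescaling is left to rmultinom(), which normalizes 'prob' internally.

```r
# Hypothetical baseline capture probabilities for five intervals
captureProb <- c(0.10, 0.25, 0.30, 0.20, 0.15)

# Hypothetical simulated copy state for each interval (1 = diploid)
simCN <- c(1, 1.5, 0, 1, 0.5)

# The multiplier equals the copy state, except that complete deletions
# (copy state 0) use 0.001 so they can still receive a few molecules.
multiplier <- ifelse(simCN == 0, 0.001, simCN)
adjProb <- captureProb * multiplier

set.seed(42)

# rmultinom() rescales 'prob' to sum to 1, so adjProb is passed as-is
molCount <- rmultinom(n = 1, size = 10000L, prob = adjProb)[, 1]
molCount
```

With these inputs the deleted interval keeps a capture weight of 0.0003 instead of 0, so it is expected to collect only a handful of molecules rather than being excluded outright.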

See Also

cnvSimCounts cnvSimPool


daynefiler/mcCNV documentation built on Dec. 15, 2021, 3:58 a.m.