01-sims: Simulating Read-Counts and Working With Priors for DeepCNV


Description

The DeepCNV class is used to fit a Bayesian model to targeted sequencing data from one or a few genes in order to draw inferences about possible copy number changes. It includes routines to simulate read-counts with known copy number state and known fraction of normal 'contaminating' cells.

Usage

simReads(nMut, nVar, nu, cnstate=cnSet, depth=100, sdepth=25, ...)
CNVariant(nu, S = cnSet, V = vtSet, M = 2)
cnvLikelihood(nu, K, N, V, S)

Arguments

nMut

integer; the number of somatic mutations in a gene

nVar

integer; the number of variant SNPs in a gene

nu

numeric between 0 and 1; the fraction of normal cells in the sample

cnstate

the copy number state of the gene; must be one of "Deleted", "Normal", or "Gained", as enumerated in cnSet

depth

integer; the average read depth at the gene

sdepth

integer; the standard deviation of the read depth across variants within a gene

...

extra parameters to pass from simReads to CNVariant

S

The copy number state, as enumerated in cnSet

V

The variant type. Must be one of "Mutation" or "SNP" as enumerated in vtSet

M

the total number of replicate copies of an (allelic) gene. The default value of 2 corresponds to a gain of one copy

K

Number of variant reads

N

Number of total reads (both variant and reference)

Details

The DeepCNV class is used to fit a Bayesian model to targeted sequencing data from one or a few genes in order to draw inferences about possible copy number changes. Basically, we assume that the observed data consists of a list of triples (K, N, V), one for each variant in a gene. Here K is the number of variant reads, N is the total number of reads, and V is the type of each variant (either a known SNP or a somatic mutation). We model (K, N) using a binomial distribution, where the 'success' parameter φ depends (in a deterministic way) on the unknown parameters of interest: the fraction ν of normal cells in the sample and the copy number state (Normal, Deleted, or Gained).
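
Written out, the binomial model described above gives the following log-likelihood. This is a sketch under one stated assumption, namely that the variants are conditionally independent given their success parameters; the subscript i (indexing the variants in a gene) is notation introduced here for illustration.

```latex
\log L(S, \nu) = \sum_{i} \left[ \log \binom{N_i}{K_i} + K_i \log \phi_i + (N_i - K_i) \log (1 - \phi_i) \right],
\qquad \phi_i = \phi(V_i, S, \nu)
```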

The functions cnvLikelihood and CNVariant are used to compute the log-likelihood of the unknown parameters given the observed data. CNVariant computes the success parameter φ as a function of the variant type V, the copy number state S, and the fraction ν of normal cells; cnvLikelihood then uses this parameter to compute the binomial log-likelihood of the observed read counts (K, N).
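
The two-step computation (first φ, then the binomial log-likelihood) can be sketched in base R. This is an illustration, not the package's code. It assumes a simple two-population mixture for a heterozygous SNP, in which normal cells carry one variant copy out of two and tumor cells carry vTumor variant copies out of cTumor total copies. The helper names phiSketch and logLikSketch, and the mixture formula itself, are assumptions made for this example; CNVariant may parameterize φ differently.

```r
# Hypothetical helper: expected variant read fraction for a heterozygous SNP
# in a mixture of normal cells (fraction nu) and tumor cells (fraction 1 - nu).
# Assumption: normal cells contribute 1 variant copy out of 2 total copies.
phiSketch <- function(nu, vTumor, cTumor) {
  (nu * 1 + (1 - nu) * vTumor) / (nu * 2 + (1 - nu) * cTumor)
}

# Binomial log-likelihood of K variant reads out of N total reads,
# summed over variants (assumes variants are independent given phi).
logLikSketch <- function(K, N, phi) {
  sum(dbinom(K, size = N, prob = phi, log = TRUE))
}

K <- c(69, 48)
N <- c(153, 167)
nu <- 0.22

phiNormal  <- phiSketch(nu, vTumor = 1, cTumor = 2)  # equals 0.5 for any nu
phiDeleted <- phiSketch(nu, vTumor = 1, cTumor = 1)  # variant allele retained
phiGained  <- phiSketch(nu, vTumor = 2, cTumor = 3)  # variant allele duplicated

# One log-likelihood per candidate copy number state
sapply(c(Normal = phiNormal, Deleted = phiDeleted, Gained = phiGained),
       function(phi) logLikSketch(K, N, phi))
```

Under this sketch, the state with the largest value is the one best supported by the data at the given ν.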

The simReads function generates simulated read-count data based on the underlying theoretical binomial model. More details can be found in the vignettes d01-cnvTheory and d02-oneGeneSims.
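
The idea behind the simulation can be sketched as follows: draw a total depth N for each variant, then draw the variant count K from a binomial distribution with success probability φ. This is an illustration only. The function name simCountsSketch, the rounded normal distribution for depth, and the floor of one read are assumptions made for this example, not a description of how simReads is implemented.

```r
# Hypothetical simulator: one row of (K, N) per variant.
simCountsSketch <- function(nVariants, phi, depth = 100, sdepth = 25) {
  # Assumed depth model: rounded normal draw, floored at 1 read
  N <- pmax(1, round(rnorm(nVariants, mean = depth, sd = sdepth)))
  # Variant reads follow the binomial model with success probability phi
  K <- rbinom(nVariants, size = N, prob = phi)
  data.frame(K = K, N = N)
}

set.seed(1)  # for reproducibility of this example only
sims <- simCountsSketch(nVariants = 7, phi = 0.5, depth = 130)
sims
```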

Value

The simReads function returns a data frame suitable for use by the function makeCNVPosterior.

The CNVariant function returns a real number between zero and one, corresponding to the fraction of reads that are expected to be variants given the variant type (V), the copy number state (S), and the fraction of normal cells (nu).

The cnvLikelihood function returns a real number representing the log-likelihood (yes, I know; it probably should be renamed) of the parameters (S, ν) given the observed data (K, N).

Author(s)

Kevin R. Coombes krc@silicovore.com

See Also

CNVPrior, CNVPosterior.

Examples

simReads(nMut=2, nVar=7, nu=0.17, cnstate="Normal", depth=130)
# check log-likelihoods of different copy number states
# for the same observed data
obs <- data.frame(K=c(69, 48), N=c(153, 167))
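# arguments follow the Usage signature; V="SNP" is assumed for illustration
cnvLikelihood(0.22, obs$K, obs$N, V="SNP", S="Normal")
cnvLikelihood(0.22, obs$K, obs$N, V="SNP", S="Deleted")
cnvLikelihood(0.22, obs$K, obs$N, V="SNP", S="Gained")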

DeepCNV documentation built on May 2, 2019, 5:23 p.m.
