Parameters for computing emission probabilities for a 6-state HMM, including starting values for the means and standard deviations of log R ratios (assumed to be Gaussian) and B allele frequencies (truncated Gaussian), and initial state probabilities.

Constructor for EmissionParam class

This function is exported primarily for internal use by other BioC packages.

```
cn_means(object)
cn_sds(object)
baf_means(object)
baf_sds(object)
baf_means(object) <- value
baf_sds(object) <- value
cn_sds(object) <- value
cn_means(object) <- value
EmissionParam(cn_means = CN_MEANS(), cn_sds = CN_SDS(),
  baf_means = BAF_MEANS(), baf_sds = BAF_SDS(), initial = rep(1/6, 6),
  EMupdates = 5L, CN_range = c(-5, 3), temper = 1, p_outlier = 1/100,
  modelHomozygousRegions = FALSE)
EMupdates(object)
## S4 method for signature 'EmissionParam'
show(object)
```

`object` |
see |

`value` |
numeric vector |

`cn_means` |
numeric vector of starting values for log R ratio means (order is by copy number state) |

`cn_sds` |
numeric vector of starting values for log R ratio standard deviations (order is by copy number state) |

`baf_means` |
numeric vector of starting values for BAF means. See example for details on how these are ordered. |

`baf_sds` |
numeric vector of starting values for BAF standard deviations. See example for details on how these are ordered. |

`initial` |
numeric vector of initial state probabilities |

`EMupdates` |
number of EM updates |

`CN_range` |
the allowable range of log R ratios. Log R ratios outside this range are thresholded. |

`temper` |
Emission probabilities can be tempered by emit^temper. This is highly experimental. |

`p_outlier` |
probability that an observation is an outlier (assumed to be the same for all markers) |

`modelHomozygousRegions` |
logical. If FALSE (default), the emission probabilities for BAFs are modeled from a mixture of truncated normals and a Unif(0,1) where the mixture probabilities are given by the probability that the SNP is heterozygous. See Details below for a discussion of the implications. |
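As a rough illustration of the `CN_range` and `temper` arguments, the base R sketch below clamps a few log R ratios to the allowable range and raises a vector of emission probabilities to the power `temper`. The input values are hypothetical, and clamping with `pmin`/`pmax` is only one plausible reading of "thresholded"; the package's internal code may differ.

```r
# Hypothetical log R ratios; the first and last fall outside CN_range.
CN_range <- c(-5, 3)
lrr <- c(-7.2, -0.4, 0.1, 3.5)

# Assumption: "thresholded" means clamped to the nearest bound of CN_range.
lrr_thresholded <- pmin(pmax(lrr, CN_range[1]), CN_range[2])
print(lrr_thresholded)

# Tempering as described for the `temper` argument: emit^temper.
# Hypothetical emission probabilities for three states.
emit <- c(0.90, 0.05, 0.01)
temper <- 0.5
emit_tempered <- emit^temper
print(emit_tempered)
```

With `temper = 1` (the default) the emission probabilities are unchanged. Mathematically, an exponent below 1 shrinks the ratios between states.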

The log R ratios are assumed to be emitted from a normal distribution with a mean and standard deviation that depend on the latent copy number. Similarly, the BAFs are assumed to be emitted from a truncated normal distribution with a mean and standard deviation that depend on the latent number of B alleles relative to the total number of alleles (A+B).
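The two emission densities described above can be sketched in base R as follows. The helper `dtrunc_norm` and all parameter values are hypothetical and chosen for illustration; they are not the package defaults or its internal implementation. The truncation interval [0, 1] is assumed because BAFs are proportions.

```r
# Density of a normal distribution truncated to [lower, upper].
# Hypothetical helper, not part of the package.
dtrunc_norm <- function(x, mean, sd, lower = 0, upper = 1) {
  # Probability mass of the untruncated normal inside the interval
  mass <- pnorm(upper, mean, sd) - pnorm(lower, mean, sd)
  dens <- dnorm(x, mean, sd) / mass
  # The density is zero outside the truncation interval
  ifelse(x >= lower & x <= upper, dens, 0)
}

# Illustrative parameters for one latent state (assumed values)
lrr_mean <- 0
lrr_sd <- 0.2
baf_mean <- 0.5
baf_sd <- 0.05

# Log R ratio emission density: ordinary normal
lrr_density <- dnorm(0.1, mean = lrr_mean, sd = lrr_sd)

# BAF emission density: normal truncated to [0, 1]
baf_density <- dtrunc_norm(0.48, mean = baf_mean, sd = baf_sd)

print(c(lrr = lrr_density, baf = baf_density))
```

Dividing by the mass inside the interval rescales the density so that it integrates to 1 over [0, 1].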

numeric vector

When `modelHomozygousRegions` is FALSE (the default in versions >= 1.28.0), emission probabilities for B allele frequencies are calculated from a mixture of truncated normal densities and a Unif(0,1) density, with the mixture probabilities given by the probability that a SNP is homozygous. In particular, let `p` denote a 6-dimensional vector of density estimates from a truncated normal distribution for the latent genotypes 'A', 'B', 'AB', 'AAB', 'ABB', 'AAAB', and 'ABBB'. The probability that a genotype is homozygous is estimated as

`prHom = (p["A"] + p["B"]) / sum(p)`

and the probability that the genotype is heterozygous (any latent genotype that is not 'A' or 'B') is given by

`prHet = 1 - prHom`

Since the density of a Unif(0,1) is 1, the 6-dimensional vector of emission probabilities at a SNP is given by

`emit = prHet * p + (1 - prHet)`

The above has the effect of minimizing the influence of BAFs near 0 and 1 on the state path estimated by the Viterbi algorithm. In particular, the emission probability at homozygous SNPs will be virtually the same for states 3 and 4, but at heterozygous SNPs the emission probability will be an order of magnitude greater for state 3 (diploid) than for state 4 (diploid region of homozygosity). The advantage of this parameterization is fewer false positive hemizygous deletion calls. [Log R ratios tend to be more sensitive to technical sources of variation than the corresponding BAFs/genotypes. Regions in which the log R ratios are low due to technical sources of variation will be less likely to be interpreted as evidence of copy number loss if heterozygous genotypes have more 'weight' in the emission estimates than homozygous genotypes.] The trade-off is that the only states called by the HMM are those with copy number alterations. In particular, copy-neutral regions of homozygosity will not be called.
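The mixture calculation described above can be worked through in base R. The helper `mix_emission` and the density values in `p_het` and `p_hom` are hypothetical, made up to show the arithmetic; they are not output from the package. Each vector has one named entry per latent genotype label used in the text.

```r
# Mix truncated-normal densities with a Unif(0,1) density (density = 1),
# weighting by the estimated probability that the SNP is heterozygous.
# Hypothetical helper, not part of the package.
mix_emission <- function(p) {
  pr_hom <- unname((p["A"] + p["B"]) / sum(p))
  pr_het <- 1 - pr_hom
  pr_het * p + (1 - pr_het)
}

# Illustrative densities at a SNP with a BAF near 0.5 (looks heterozygous)
p_het <- c(A = 0.01, B = 0.01, AB = 3.0, AAB = 0.4, ABB = 0.4,
           AAAB = 0.09, ABBB = 0.09)

# Illustrative densities at a SNP with a BAF near 0 (looks homozygous)
p_hom <- c(A = 9.0, B = 0, AB = 0, AAB = 0.1, ABB = 0,
           AAAB = 0.9, ABBB = 0)

emit_het <- mix_emission(p_het)
emit_hom <- mix_emission(p_hom)

print(round(emit_het, 3))
print(round(emit_hom, 3))
```

In this sketch the heterozygous SNP keeps nearly all of its contrast between genotypes, because `prHet` is close to 1. At the homozygous SNP every entry is pulled toward 1, so the SNP contributes little to distinguishing states.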

By setting `modelHomozygousRegions = TRUE`, the emission probabilities at a SNP are given simply by the `p` vector described above and copy-neutral regions of homozygosity will be called.

```
ep <- EmissionParam()

## Accessors for the starting values
cn_means(ep)
cn_sds(ep)
baf_means(ep)
baf_sds(ep)

## Replacement methods (each value is simply reassigned to itself here)
baf_means(ep) <- baf_means(ep)
baf_sds(ep) <- baf_sds(ep)
cn_sds(ep) <- cn_sds(ep)
cn_means(ep) <- cn_means(ep)

## Print a summary of the object
show(ep)
```
