Fit a 6-state HMM to log R ratios and B allele frequencies estimated from SNP arrays

Description

This function is intended for estimating the integer copy number from germline or DNA of clonal origin using a 6-state HMM. The states are homozygous deletion, hemizygous deletion, diploid copy number, diploid region of homozygosity, single copy gain, and two+ copy gain. Because heterozygous markers are more informative for copy number than homozygous markers and regions of homozgosity are common in normal genomes, we currently computed a weighted average of the BAF emission matrix with a uniform 0,1 distribution by the probability that the marker is heterozygous, thereby downweighting the contribution of homozygous SNPs to the likelihood. In addition to making the detection of copy-neutral regions of homozgosity less likely, it also helps prevent confusing hemizygous deletions with copy neutral regions of homozygosity – the former would be driven mostly by the log R ratios. This is experimental and subject to change.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
hmm2(object, emission_param = EmissionParam(),
  transition_param = TransitionParam(), ...)

## S4 method for signature 'SnpArrayExperiment'
hmm2(object, emission_param = EmissionParam(),
  transition_param = TransitionParam(), ...)

## S4 method for signature 'oligoSnpSet'
hmm2(object, emission_param = EmissionParam(),
  transition_param = TransitionParam(), ...)

## S4 method for signature 'ArrayViews'
hmm2(object, emission_param = EmissionParam(),
  transition_param = TransitionParam(), tolerance = 2, verbose = FALSE,
  ...)

Arguments

object

A SnpArrayExperiment

emission_param

A EmissionParam object

transition_param

A TransitionParam object

...

currently ignored

tolerance

length-one numeric vector. When the difference in the log-likelihood of the Viterbi state path between successive models (updated by Baum Welch) is less than the tolerance, no additional model updates are performed.

verbose

logical. Whether to display messages indicating progress.

Details

The hmm2 method allows parallelization across samples using the foreach paradigm. Parallelization is automatic when enabled via packages such as snow/doSNOW.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
tp <- TransitionParam()
TransitionParam(taup=1e12)
data(snp_exp)
emission_param <- EmissionParam(temper=1/2)
fit <- hmm2(snp_exp, emission_param)
unlist(fit)
cnvSegs(fit)
## There is too little data to infer cnv reliably in this trivial example.
## To illustrate filtering options on the results, we select
## CNVs for which
## - the CNV call has a posterior probability of at least 0.5
## - the number of features is 2 or more
## - the HMM states are 1 (homozygous deletion) or 2 (hemizygous deletion)
fp <- FilterParam(probability=0.5, numberFeatures=2, state=c("1", "2"))
cnvSegs(fit, fp)
## for parallelization
## Not run: 
   library(snow)
   library(doSNOW)
   cl <- makeCluster(2, type = "SOCK")
   registerDoSNOW(cl)
   fit <- hmm2(snp_exp, emission_param)

## End(Not run)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.