Description Usage Arguments Details Value Author(s) References See Also Examples
Function to run the Expectation Maximization Algorithm for inference of model parameters: cellular prevalence, normal proportion, tumour ploidy. This is a key function in the TitanCNA package and is the most computationally intense. This function makes calls to a C subroutine that allows the algorithm to be run more efficiently.
| 1 2 3 4 5 6 | 
| data | 
 | 
| params | 
 | 
| txnExpLen | Influences prior probability of genotype transitions in the HMM. Smaller value have lower tendency to change state; however, too small and it produces underflow problems.  | 
| txnZstrength | Influences prior probability of clonal cluster transitions in the HMM. Smaller value have lower tendency to change clonal cluster state.  | 
| pseudoCounts | Small, machine precision values to add to probabilities to avoid underflow. For example,  | 
| maxiter | Maximum number of expectation-maximization iterations allowed. In practice, for TitanCNA, it will usually not exceed 20. | 
| maxiterUpdate | Maximum number of coordinate descent iterations during the M-step (of EM algorithm) when parameters are estimated. | 
| normalEstimateMethod | Specifies how to handle normal proportion estimation. Using  | 
| estimateS | Logical indicating whether to account for clonality and estimate subclonal events. See Details. | 
| estimatePloidy | Logical indicating whether to estimate and account for tumour ploidy. | 
| useOutlierState | Logical indicating whether an additional outlier state should be used. In practice, this is usually not necessary. | 
| likChangeThreshold | EM convergence criteria - stop EM when complete log likelihood changes less than the proportion specified by this argument. | 
| verbose | Set to FALSE to suppress program messages. | 
This function is implemented with the "foreach" package and therefore supports parallelization. See "doMC" or "doMPI" for some parallelization packages.
The forwards-backwards algorithm is used for the E-step in the EM algorithm. This is done using a call to a C subroutine for each chromosome. The maximization step uses maximum a posteriori (MAP) for estimation of parameters.
If the sample has absolutely no normal contamination, then assign nParams$n_0 <- 0 and use argument normalEstimateMethod="fixed".
estimateS should always be set to TRUE. If no subclonality is expected, then use loadDefaultParameters(numberClonalClusters=1). Using estimateS=FALSE and loadDefaultParameters(numberClonalClusters=0) is gives more or less the same results.
list with components for results returned from the EM algorithm, including converged parameters, posterior marginal responsibilities, log likelihood, and original parameter settings.
| n | Converged estimate for normal contamination parameter.  | 
| s | Converged estimate(s) for cellular prevalence parameter(s). This value is defined as the proportion of tumour sample that does not contain the aberrant genotype. This will contrast what is output in  | 
| var | Converged estimates for variance parameter of the Gaussian mixtures used to model the log ratio data.  | 
| phi | Converged estimate for tumour ploidy parameter.  | 
| piG | Converged estimate for initial genotype state distribution.  | 
| piZ | Converged estimate for initial clonal cluster state distribution.  | 
| muR | Mean of binomial mixtures computed as a function of  | 
| muC | Mean of Gaussian mixtures computed as a function of  | 
| loglik | Posterior Log-likelihood that includes data likelihood and the priors.  | 
| rhoG | Posterior marginal probabilities for the genotype states computed during the E-step. Only the final iteration is returned as a  | 
| rhoZ | Posterior marginal probabilities for the clonal cluster states computed during the E-step. Only the final iteration is returned as a  | 
| genotypeParams | Original genotype parameters. See  | 
| ploidyParams | Original tumour ploidy parameters. See  | 
| normalParams | Original normal contamination parameters. See  | 
| clonalParams | Original subclonal parameters. See  | 
| txnExpLen | Original genotype transition expected length. See  | 
| txnZstrength | Original clonal cluster transition expected length. See  | 
| useOutlierState | Original setting indicating usage of outlier state. See  | 
Gavin Ha <gavinha@gmail.com>
Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)
"foreach", "doMC", "doMPI", loadAlleleCounts, loadDefaultParameters, viterbiClonalCN
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | message('Running TITAN ...')
#### LOAD DATA ####
infile <- system.file("extdata", "test_alleleCounts_chr2.txt", 
              package = "TitanCNA")
data <- loadAlleleCounts(infile)
#### LOAD PARAMETERS ####
message('titan: Loading default parameters')
numClusters <- 2
params <- loadDefaultParameters(copyNumber = 5, 
              numberClonalClusters = numClusters, skew = 0.1)
#### READ COPY NUMBER FROM HMMCOPY FILE ####
message('titan: Correcting GC content and mappability biases...')
tumWig <- system.file("extdata", "test_tum_chr2.wig", package = "TitanCNA")
normWig <- system.file("extdata", "test_norm_chr2.wig", package = "TitanCNA")
gc <- system.file("extdata", "gc_chr2.wig", package = "TitanCNA")
map <- system.file("extdata", "map_chr2.wig", package = "TitanCNA")
cnData <- correctReadDepth(tumWig, normWig, gc, map)
logR <- getPositionOverlap(data$chr, data$posn, cnData)
data$logR <- log(2^logR) #transform to natural log
#### FILTER DATA FOR DEPTH, MAPPABILITY, NA, etc ####
data <- filterData(data, 1:24, minDepth = 10, maxDepth = 200, map = NULL)
#### EM (FWD-BACK) TO TRAIN PARAMETERS ####
#### Can use parallelization packages ####
K <- length(params$genotypeParams$alphaKHyper)
params$genotypeParams$alphaKHyper <- rep(500, K)
params$ploidyParams$phi_0 <- 1.5
convergeParams <- runEMclonalCN(data, params,
                                maxiter = 3, maxiterUpdate = 500, 
                                txnExpLen = 1e15, txnZstrength = 5e5, 
                                useOutlierState = FALSE, 
                                normalEstimateMethod = "map", 
                                estimateS = TRUE, estimatePloidy = TRUE)
 | 
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.