Functions to format TitanCNA results into a data.frame and write the results to a tab-delimited file.
outputTitanResults(data, convergeParams, optimalPath, filename = NULL,
    is.haplotypeData = FALSE, posteriorProbs = FALSE, subcloneProfiles = TRUE,
    correctResults = TRUE, proportionThreshold = 0.05,
    proportionThresholdClonal = 0.05, recomputeLogLik = TRUE, rerunViterbi = FALSE,
    verbose = TRUE)
outputModelParameters(convergeParams, results, filename,
    S_Dbw.scale = 1, S_Dbw.method = "Tong", S_Dbw.useCorrectedCN = TRUE)
outputTitanSegments(results, id, convergeParams, filename = NULL,
    igvfilename = NULL)

id 
Character string identifier for sample 
data 

convergeParams 

optimalPath 

results 
Formatted TitanCNA results output from 
filename 
Path of the file to write the TitanCNA results. 
igvfilename 
Path of the file to write the IGV seg file. 
posteriorProbs 

is.haplotypeData 

subcloneProfiles 

correctResults 

recomputeLogLik 

rerunViterbi 

proportionThreshold 
Minimum proportion of the genome altered (by SNPs) for a cluster to be retained. Clonal clusters with a lower proportion of alteration are removed. 
proportionThresholdClonal 
Minimum proportion of the genome altered by clonal events (by SNPs) for the highest cellular prevalence cluster. If the highest prevalence cluster contains a lower proportion of events than this threshold, it is removed and the next highest (subclonal) cluster is readjusted to be the clonal cluster. 
S_Dbw.scale 
The S_Dbw validity index can be adjusted to account for differences between datasets. 
S_Dbw.method 
Compute S_Dbw validity index using 
S_Dbw.useCorrectedCN 

verbose 
Print status messages. 
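The effect of proportionThreshold can be illustrated with a small toy example. This is only a sketch of the idea described above, not TitanCNA's internal implementation: the `calls` data frame and its values are hypothetical, and it assumes that positions without a called aberration carry an NA cluster.

```r
# Toy per-SNP calls (hypothetical values, for illustration only).
# Assumption: unaltered (HET) positions have no clonal cluster (NA).
calls <- data.frame(
  TITANcall     = c(rep("HET", 90), rep("DLOH", 8), rep("NLOH", 2)),
  ClonalCluster = c(rep(NA, 90), rep(1, 8), rep(2, 2))
)
proportionThreshold <- 0.05

# Proportion of all SNPs altered in each cluster (table() drops NA by default)
prop_altered <- table(calls$ClonalCluster) / nrow(calls)

# Clusters at or above the threshold would be retained in this sketch
keep <- names(prop_altered)[as.vector(prop_altered) >= proportionThreshold]
print(prop_altered)
print(keep)
```

In this toy data, cluster 1 covers 8 of 100 SNPs and cluster 2 covers 2 of 100, so only cluster 1 meets the 0.05 threshold.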
outputModelParameters writes a file containing the estimated TITAN model parameters and the model selection index. Each row contains information about a different parameter:
1) Normal contamination estimate: proportion of normal content in the sample; tumour content is 1 minus this number
2) Average tumour ploidy estimate: average number of estimated copies in the genome; 2 represents diploid
3) Clonal cluster cellular prevalence: Z denotes the number of clonal clusters; the space-delimited values that follow are the cellular prevalence estimates for each cluster. Cellular prevalence here is defined as the proportion of the tumour sample that contains the aberrant genotype.
4) Genotype binomial means for clonal cluster Z: set of 21 estimated binomial parameters for each specified cluster
5) Genotype Gaussian means for clonal cluster Z: set of 21 estimated Gaussian means for each specified cluster
6) Genotype Gaussian variance: set of 21 estimated Gaussian variances; variances are shared across all clusters
7) Number of iterations: number of EM iterations needed for convergence
8) Log likelihood: complete-data log-likelihood for the current cluster run
9) S_Dbw dens.bw: density component of the S_Dbw index; see computeSDbwIndex
10) S_Dbw scat: scatter component of the S_Dbw index; see computeSDbwIndex
11) S_Dbw validity index: used for model selection, where the run with the optimal number of clusters is the one with the lowest S_Dbw index. This value is slightly modified from that computed by computeSDbwIndex. It is computed as S_Dbw = S_Dbw.scale * dens.bw + scat
12) S_Dbw dens.bw, scat, and validity index are computed for the LogRatio and AllelicRatio datatypes, as well as for the combination of Both. For Both, the values are summed over both datatypes.
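The S_Dbw arithmetic described in the list above can be reproduced by hand. The sketch below uses made-up dens.bw and scat values (in practice they come from computeSDbwIndex); the data frame layout is an assumption for illustration and is not the format of the output file.

```r
# Hypothetical component values for the two datatypes (not real TitanCNA output)
sdbw_components <- data.frame(
  datatype = c("LogRatio", "AllelicRatio"),
  dens.bw  = c(0.020, 0.035),
  scat     = c(0.110, 0.090)
)
S_Dbw.scale <- 1  # default value of the S_Dbw.scale argument

# Validity index per datatype: S_Dbw = S_Dbw.scale * dens.bw + scat
sdbw_components$S_Dbw <- S_Dbw.scale * sdbw_components$dens.bw + sdbw_components$scat

# "Both": each value is summed over the two datatypes
both <- data.frame(
  datatype = "Both",
  dens.bw  = sum(sdbw_components$dens.bw),
  scat     = sum(sdbw_components$scat),
  S_Dbw    = sum(sdbw_components$S_Dbw)
)
sdbw_table <- rbind(sdbw_components, both)
print(sdbw_table)
```

Because the index is linear in dens.bw and scat, summing the per-datatype indices gives the same result as applying the formula to the summed components.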
outputTitanResults writes a file in a format similar to that described in the ‘Value’ section.
outputTitanResults also returns a list containing the following:
results 
TITAN results, uncorrected for cluster number and parameters 
corrResults 
TITAN results, corrected by removing empty clusters and parameters adjusted accordingly. 
convergeParams 
Corrected parameter object 
The results and corrResults are data.table objects, where each row corresponds to a position in the analysis, and with the following columns:
Chr 
character denoting chromosome number. ChrX and ChrY use ‘X’ and ‘Y’. 
Position 
genomic coordinate 
RefCount 
number of reads matching the reference base 
NRefCount 
number of reads matching the non-reference base 
Depth 
total read depth at the position 
AllelicRatio 
RefCount/Depth 
LogRatio 
log2 ratio between normalized tumour and normal read depths 
CopyNumber 
predicted TitanCNA copy number 
TITANstate 
internal state number used by TitanCNA; see Reference 
TITANcall 
interpretable TitanCNA state; string (HOMD,DLOH,HET,NLOH,ALOH,ASCNA,BCNA,UBCNA); See Reference 
ClonalCluster 
predicted TitanCNA clonal cluster; lower cluster numbers represent clusters with higher cellular prevalence 
CellularPrevalence 
proportion of tumour cells containing event; not to be mistaken as proportion of sample (including normal) 
If subcloneProfiles is set to TRUE, then the subclone profiles will be appended to the output data.frame.
Subclone1.CopyNumber 
Integer copy number for Subclone 1. 
Subclone1.TITANcall 
States for Subclone 1 
Subclone1.Prevalence 
The cellular prevalence of Subclone 1, sometimes referred to as the subclone fraction. 
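Two of the column definitions above lend themselves to a quick worked example: AllelicRatio as RefCount/Depth, and the distinction between cellular prevalence (a proportion of tumour cells) and the proportion of the whole sample. The sketch below uses hypothetical counts and estimates; it assumes Depth equals RefCount plus NRefCount and that the sample-level fraction is the cellular prevalence multiplied by the tumour content (1 minus the normal contamination).

```r
# Hypothetical read counts at two positions
snp <- data.frame(RefCount = c(30, 12), NRefCount = c(30, 36))

# Assumption: total depth is the sum of reference and non-reference reads
snp$Depth <- snp$RefCount + snp$NRefCount
snp$AllelicRatio <- snp$RefCount / snp$Depth

# Hypothetical estimates
normal_contamination <- 0.3  # proportion of normal content in the sample
cellular_prevalence  <- 0.8  # proportion of tumour cells containing the event

# Proportion of the whole sample (tumour + normal) containing the event
tumour_content  <- 1 - normal_contamination
sample_fraction <- cellular_prevalence * tumour_content

print(snp)
print(sample_fraction)
```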
outputModelParameters returns a list containing the S_Dbw model selection:
dens.bw 

scat 

S_Dbw 
S_Dbw.scale * dens.bw + scat 
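Since the run with the lowest S_Dbw index is preferred, choosing among runs with different numbers of clonal clusters reduces to taking a minimum. A minimal sketch, assuming the S_Dbw value from each run has already been collected into a named vector (the values and the vector itself are hypothetical):

```r
# Hypothetical S_Dbw values from runs with 1 to 4 clonal clusters
sdbw_by_run <- c("1" = 0.41, "2" = 0.26, "3" = 0.31, "4" = 0.38)

# The preferred run is the one with the lowest S_Dbw validity index
best_num_clusters <- as.integer(names(which.min(sdbw_by_run)))
print(best_num_clusters)
```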
Gavin Ha <gavinha@gmail.com>
Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)
runEMclonalCN, viterbiClonalCN, computeSDbwIndex
data(EMresults)

#### COMPUTE OPTIMAL STATE PATH USING VITERBI ####
optimalPath <- viterbiClonalCN(data, convergeParams)

#### FORMAT RESULTS ####
results <- outputTitanResults(data, convergeParams, optimalPath,
                              filename = NULL, posteriorProbs = FALSE,
                              subcloneProfiles = TRUE, correctResults = TRUE,
                              proportionThreshold = 0.05, recomputeLogLik = FALSE,
                              proportionThresholdClonal = 0.05,
                              is.haplotypeData = FALSE)
## use corrected parameters
convergeParams <- results$convergeParams
## use corrected results
results <- results$corrResults

#### OUTPUT RESULTS TO FILE ####
outparam <- paste0("cluster2_params.txt")
outputModelParameters(convergeParams, results, outparam)

#### OUTPUT SEGMENTS TO FILE ####
outseg <- paste0("cluster2_segs.txt")
outigv <- paste0("cluster2.seg")
segs <- outputTitanSegments(results, id = "test", convergeParams,
                            filename = outseg, igvfilename = outigv)
# segment results are also stored in the data.frame "segs"

