runMadSeq: Model to detect and quantify mosaic aneuploidy
In MADSEQ: Mosaic Aneuploidy Detection and Quantification using Massive Parallel Sequencing Data

Description Usage Arguments Value Note Author(s) References See Also Examples

Take in the heterozygous sites and coverage information, use different models (normal, monosomy, mitotic trisomy, meiotic trisomy, loss of heterozygosity) to fit the data, and select the model fit the data best according to BIC value and return estimation of the fraction of aneuploid cells.

1
2
3

runMadSeq(hetero, coverage, target_chr, adapt = 10000, burnin = 10000,
  nChain = 2, nStep = 10000, thinSteps = 2, checkConvergence = FALSE,
  plot = TRUE)

`hetero`	A `character` specify the location of processed heterozygous table returned by `prepareHetero` function, or A `GRanges` object returned by `prepareHetero` function
`coverage`	A `character` specify the location of normalized coverage table returned by `normalizeCoverage` function, or A `GRanges` object from the `GRangesList` returned by `normalizeCoverage` function. Look up your sample by `names(GRangesList)`, and subset your the normalized coverage for your sample by `GRangesList["sample_name"]`. For more details, please check the example.
`target_chr`	A `character` specify the chromosome number you want to detect. Note: Please check your assembly, use contig name "chr1" or "1" accordingly.
`adapt`	A `integer` indicate the adaption steps for the MCMC sampling. Default: 10000
`burnin`	A `integer` indicate burnin steps for the MCMC sampling. Default: 10000. If the posterior distribution is not converged, increasing burnin steps can be helpful.
`nChain`	A `integer` indicate the number of chains for the MCMC sampling. Default: 2. Note: More than 1 chain is required if `checkConvergence` is set to `TRUE`.
`nStep`	A `integer` indicate the number of steps to be recorded for the MCMC sampling. Default: 10000. Generally, the more steps you record, the more accurate the estimation is.
`thinSteps`	A `integer` indicate the number of steps to "thin" (thinSteps=1) means save everystep. Default: 2.
`checkConvergence`	A `Boolean` indicate whether to check the convergence of independent MCMC chains. If your data is not converged, you may increase adaption step and burnin step. Default: `FALSE`
`plot`	A `Boolean`. If `TRUE`, the alternative allele frequency (AAF) for each heterozygous site along the target chromosome will be plotted.

An S4 object of class MadSeq containing the posterior distribution for the selected model, and deltaBIC between five models.

1.If you didn't write normalized coverage into file, please subset the normalized coverage GRanges object from the GRangesList object returned from the normalizeCoverage funciton.
2. When specify target_chr, please make sure it consist with the contig names in your sequencing data, example: "chr1" and "1".
3. If checkConvergence set to TRUE, the nChain has to be >2
4. If it shows that your chains are not converged, helpful options are increasing the adapt and burnin steps.
5. Because the model is an MCMC sampling process, it can take a very long time to finish. Running in the background or HPC is recommended.

Yu Kong

Martyn Plummer (2016). rjags: Bayesian Graphical Models using MCMC. R package version 4-6.
https://CRAN.R-project.org/package=rjags

MadSeq, plotMadSeq, plotFraction, plotMixture

## ------------------------------------------------------------------------
## The following example is for the case that normalized coverage and 
## processed heterozygous sites have not been written to files. For more 
## examples, please check the documentation.
## ------------------------------------------------------------------------
##------Prepare Heterozygous Sites
## specify the path to the vcf.gz file for the aneuploidy sample
aneuploidy_vcf = system.file("extdata","aneuploidy.vcf.gz",package="MADSEQ")
## specify the path to the bed file containing targeted region
target = system.file("extdata","target.bed",package="MADSEQ")
## prepare heterozygous sites
aneuploidy_hetero = prepareHetero(aneuploidy_vcf,target, writeToFile=FALSE)

##------Prepare Normalized Coverage
## specify the path to the bam file
aneuploidy_bam = system.file("extdata","aneuploidy.bam",package="MADSEQ")
normal_bam = system.file("extdata","normal.bam",package="MADSEQ")

## prepare coverage data for the samples
aneuploidy_cov_gc = prepareCoverageGC(target,aneuploidy_bam,"hg19")
normal_cov_gc = prepareCoverageGC(target,normal_bam,"hg19")

## normalize the coverage
normed = normalizeCoverage(aneuploidy_cov_gc,
                           control=normal_cov_gc,writeToFile=FALSE)
                           
##------subset normalized coverage GRanges object
aneuploidy_normed_cov = normed[["aneuploidy_cov_gc"]]
## check chromosome18
## (to speed up the example, we only run one chain and less steps here, 
##  but default settings are recommended in real case)
aneuploidy_chr18 = runMadSeq(aneuploidy_hetero, aneuploidy_normed_cov,
                             target_chr="chr18", adapt=100, burnin=200,
                             nChain =1, nStep = 1000, thinSteps=1)