HMMcopy is a package for estimating copy number free of the GC-content and mappability biases found in HTS readcounts, which it corrects for. It also contains an implementation of a Hidden Markov Model that robustly segments a copy number profile into non-overlapping segments predicted to be of the same copy number state, and attributes biological copy number aberration events to the segments.
HMMcopy takes as input WIG format files generated by fast C++ tools distributed as part of the HMMcopy Suite, namely readcount, GC-content and mappability values for non-overlapping fixed-width “bins” across the reference genome of interest. It then uses a filtering step and a LOESS model to correct the GC-content and mappability biases observed in the readcounts (Benjamini and Speed, 2012), and uses the corrected readcounts as a proxy for copy number. The resulting copy number profile is then segmented with a six-state Hidden Markov Model, and a handful of visualization functions are provided for quick viewing.
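The LOESS-based correction described above can be illustrated with a minimal sketch on simulated bins. This is not HMMcopy's internal code: the bin count, the shape of the GC bias curve, the variable names and the span value are all assumptions chosen for illustration, and the mappability correction and filtering steps are left out.

```r
# Illustrative sketch only: simulated bins, not HMMcopy's internal code.
set.seed(1)
n_bins <- 2000
gc_frac <- runif(n_bins, min = 0.30, max = 0.65)     # hypothetical GC fraction per bin
gc_effect <- 1 - 6 * (gc_frac - 0.45)^2              # assumed unimodal GC bias curve
reads <- rpois(n_bins, lambda = 200 * gc_effect)     # simulated readcounts

# Fit readcount as a smooth function of GC content.
fit <- loess(reads ~ gc_frac, span = 0.3)

# Divide observed counts by the fitted trend, so that a bin with
# typical coverage for its GC content has a value close to 1.
corrected <- reads / fitted(fit)

# Log2 ratio as a proxy for copy number; values near 0 suggest no change.
copy <- log2(corrected)

summary(corrected)
```

Dividing by the fitted trend, rather than subtracting it, keeps the corrected values on a ratio scale, which is why a log2 transform gives a profile centred near zero for bins without a copy number change.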
example("HMMcopy-package") for a quick tour of functionality and
vignette("HMMcopy") for a detailed example.
Daniel Lai, Gavin Ha, Sohrab Shah
Maintainer: Daniel Lai <firstname.lastname@example.org> and Gavin Ha <email@example.com>
Yuval Benjamini and Terence P. Speed. Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Research, 40(10):e72, May 2012.
# Read WIG file input
rfile <- system.file("extdata", "tumour.wig", package = "HMMcopy")
gfile <- system.file("extdata", "gc.wig", package = "HMMcopy")
mfile <- system.file("extdata", "map.wig", package = "HMMcopy")
uncorrected_reads <- wigsToRangedData(rfile, gfile, mfile)

# Correct reads into copy number
corrected_copy <- correctReadcount(uncorrected_reads)

# Segment copy number profile
segmented_copy <- HMMsegment(corrected_copy)

# Visualize one at a time
par(ask = TRUE)
plotBias(corrected_copy)
plotCorrection(corrected_copy)
plotSegments(corrected_copy, segmented_copy)
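To give a feel for what the segmentation step in the example above does, here is a generic Viterbi decoder for a six-state HMM with Gaussian emissions. It is a sketch under stated assumptions, not HMMcopy's model: the function name viterbi_gaussian, the state means, the shared standard deviation and the stay probability are all hypothetical, and no parameter estimation is performed.

```r
# Illustrative sketch only: a generic Viterbi decoder, not HMMcopy's model.
viterbi_gaussian <- function(x, means, sd, stay = 0.99) {
  k <- length(means)
  n <- length(x)
  # Log transition matrix: high probability of staying in the same state.
  log_trans <- matrix(log((1 - stay) / (k - 1)), k, k)
  diag(log_trans) <- log(stay)
  # Log emission densities: one row per state, one column per bin.
  log_emit <- sapply(x, function(xi) dnorm(xi, means, sd, log = TRUE))
  score <- matrix(-Inf, k, n)
  back <- matrix(0L, k, n)
  score[, 1] <- log(1 / k) + log_emit[, 1]
  for (t in 2:n) {
    for (j in 1:k) {
      cand <- score[, t - 1] + log_trans[, j]
      back[j, t] <- which.max(cand)
      score[j, t] <- max(cand) + log_emit[j, t]
    }
  }
  # Trace back the most likely state path.
  path <- integer(n)
  path[n] <- which.max(score[, n])
  for (t in (n - 1):1) path[t] <- back[path[t + 1], t + 1]
  path
}

# Simulated log2 copy number profile: 100 neutral bins, then 100 gained bins.
set.seed(1)
x <- c(rnorm(100, mean = 0, sd = 0.1), rnorm(100, mean = 0.5, sd = 0.1))
means <- c(-1, -0.5, 0, 0.5, 1, 1.5)   # assumed state means, for illustration
path <- viterbi_gaussian(x, means, sd = 0.15)

# Collapse the per-bin states into segments.
rle(path)
```

A high stay probability penalizes state changes, so isolated noisy bins do not break a segment; runs of identical states in the decoded path correspond to the non-overlapping segments described earlier.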