Description Usage Arguments Value Note Author(s) See Also Examples
This is an internal function used by mainSeek to accomplish three major tasks on a single chromosome: automatically select a bin size, compute read counts within the bins, and obtain optimal HMM parameters.
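As a rough illustration of the read-counting task, the sketch below assigns read start positions to fixed-width bins and counts them using base R only. The positions, `chromLength`, and all variable names are made up for illustration; the package itself works on GRanges objects, and its internal implementation may differ.

```r
# Hypothetical read start positions (bp) on one chromosome; not RIPSeeker data.
readStarts <- c(12, 150, 180, 420, 450, 455, 990)
chromLength <- 1000   # assumed chromosome length in bp
binSize <- 200        # fixed bin size in bp

# Assign each read to a bin by integer division, then count reads per bin.
numBins <- ceiling(chromLength / binSize)
binIndex <- (readStarts - 1) %/% binSize + 1
binCounts <- tabulate(binIndex, nbins = numBins)

# Summarise the bins with their coordinates and counts.
binTable <- data.frame(
  start = (seq_len(numBins) - 1) * binSize + 1,
  end   = pmin(seq_len(numBins) * binSize, chromLength),
  count = binCounts
)
binTable
```

Empty bins are kept with a count of zero, which matters here because a count vector dominated by zeros is the situation the `minReadCount` argument is meant to guard against.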
mainSeekSingleChrom(alignGR, K = 2, binSize = NULL, minReadCount = 10,
    backupNumBins = 10, minBinSize = 200, maxBinSize = 1200,
    increment = 5, pathToSavePlotsOfBinSizesVersusCosts,
    verbose = TRUE, allowSecondAttempt = TRUE, ...)
alignGR |
GRanges containing the alignments on a single chromosome. |
K |
Number of hidden states (Default: 2). By default, state 1 specifies the background and state 2 the RIP regions. The two states are recognized by the means for the two distributions (See |
binSize |
Size to use for binning the read counts across each chromosome. If NULL, the optimal bin size within a range (default: minBinSize=200, maxBinSize=1200) will be selected automatically (See |
minReadCount |
Minimum aligned read counts needed for HMM to converge (Default: 10). Note that the HMM may sometimes fail to converge when the majority of the read counts are zero, even if some read counts exceed 10. When that happens, a back-up function |
backupNumBins |
If read count is less than |
minBinSize |
Minimum bin size to start with the bin selection (See |
maxBinSize |
Maximum bin size to stop with the bin selection (See |
increment |
Step-wise increment in bin size selection (See |
pathToSavePlotsOfBinSizesVersusCosts |
Directory used to save the diagnostic plots for bin size selection. |
verbose |
Logical flag to disable (FALSE) or enable (TRUE) HMM training messages from function |
allowSecondAttempt |
In case the HMM fails to converge due to malformed parameters in an EM iteration, rerun the HMM process, each time with a different suboptimal bin size, in an attempt to succeed in some trial. If all attempts yield nothing, fall back to |
... |
Arguments passed to |
nbhGR |
GRanges object containing the optimized HMM parameters (and the Viterbi hidden state sequence), accompanied by the read count vector following the (automatic) binning scheme. |
Unless a highly customized workflow is needed, ripSeek is the high-level front-end main function that should be used in most cases.
Yue Li
# Load the package
library(RIPSeeker)

# Retrieve system files
extdata.dir <- system.file("extdata", package="RIPSeeker")
bamFiles <- list.files(extdata.dir, "\\.bam$", recursive=TRUE, full.names=TRUE)
bamFiles <- grep("PRC2", bamFiles, value=TRUE)

# Parameters setting
binSize <- 1e5 # use a large fixed bin size for demo only
minBinSize <- NULL # min bin size in automatic bin size selection
maxBinSize <- NULL # max bin size in automatic bin size selection
multicore <- FALSE # use multicore
strandType <- "-" # set strand type to minus strand

# Read the alignments and split them by chromosome
alignGal <- getAlignGal(bamFiles[1], reverseComplement=TRUE, genomeBuild="mm9")
alignGR <- as(alignGal, "GRanges")
alignGRList <- GRangesList(as.list(split(alignGR, seqnames(alignGR))))

################ run main function for HMM inference on a single chromosome ################
nbhGR <- mainSeekSingleChrom(alignGR=alignGRList$chrX, K = 2, binSize=binSize,
    minBinSize = minBinSize, maxBinSize = maxBinSize)
nbhGR
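The automatic bin size search controlled by `binSize`, `minBinSize`, `maxBinSize` and `increment` can be pictured as a grid search in which each candidate size is scored by a cost. The sketch below is only an illustration in base R and rests on two assumptions that this page does not confirm: that `increment` is an additive step in base pairs, and that the cost is the Shimazaki-Shinomoto histogram cost. The simulated positions and the helper `binCost` are hypothetical and not part of the package.

```r
# Shimazaki-Shinomoto histogram cost for one bin size (assumed cost function).
binCost <- function(positions, chromLength, binSize) {
  numBins <- ceiling(chromLength / binSize)
  counts <- tabulate((positions - 1) %/% binSize + 1, nbins = numBins)
  k <- mean(counts)              # mean count per bin
  v <- mean((counts - k)^2)      # biased variance of the counts
  (2 * k - v) / binSize^2
}

# Simulated read start positions; purely illustrative data.
set.seed(1)
chromLength <- 1e5
positions <- sample.int(chromLength, 2000, replace = TRUE)

# Candidate bin sizes built from the documented defaults
# (minBinSize = 200, maxBinSize = 1200, increment = 5).
candidateSizes <- seq(200, 1200, by = 5)
costs <- vapply(candidateSizes,
                function(b) binCost(positions, chromLength, b),
                numeric(1))

# The candidate with the lowest cost is taken as the selected bin size.
bestBinSize <- candidateSizes[which.min(costs)]
bestBinSize
```

Plotting `costs` against `candidateSizes` would give a diagnostic of the kind that `pathToSavePlotsOfBinSizesVersusCosts` suggests, although the exact plot produced by the package is not described here.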