findCNVs.strandseq: Find copy number variations (strandseq)

View source: R/findCNVs.R

findCNVs.strandseq R Documentation

Find copy number variations (strandseq)


findCNVs.strandseq classifies the binned read counts into several states which represent copy-numbers on each strand.


findCNVs.strandseq(binned.data, ID = NULL, R = 10, sig.lvl = 0.1,
  eps = 0.01, init = "standard", max.time = -1, max.iter = 1000,
  num.trials = 5, eps.try = max(10 * eps, 1), num.threads = 1,
  count.cutoff.quantile = 0.999, strand = "*",
  states = c("zero-inflation", paste0(0:10, "-somy")),
  most.frequent.state = "1-somy", method = "edivisive", algorithm = "EM",
  initial.params = NULL)


binned.data
A GRanges-class object with binned read counts.


ID
An identifier that will be used to identify this sample in various downstream functions. This could be, for example, the file name of binned.data.


R
method-edivisive: The maximum number of random permutations to use in each iteration of the permutation test (see e.divisive). Increase this value to increase accuracy at the cost of speed.


sig.lvl
method-edivisive: The level at which to sequentially test if a proposed change point is statistically significant (see e.divisive). Increase this value to find more breakpoints.
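To make the interplay between R and sig.lvl concrete, here is a minimal base-R sketch of a generic permutation test for a single candidate change point. It is an illustration only: the statistic (absolute difference in means), the helper split.stat and the p-value formula are assumptions chosen for simplicity, not the energy statistic or the exact procedure that e.divisive implements.

```r
set.seed(42)

# Toy signal with a clear shift in the mean after position 50
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 3))

# Hypothetical test statistic: absolute difference in means around a split
split.stat <- function(v, k) {
  abs(mean(v[1:k]) - mean(v[(k + 1):length(v)]))
}

k <- 50          # candidate change point
R <- 10          # number of random permutations
sig.lvl <- 0.1   # significance level

observed <- split.stat(x, k)
# Shuffling the data destroys any change point, giving the null distribution
permuted <- replicate(R, split.stat(sample(x), k))

# Permutation p-value; its smallest possible value is 1 / (R + 1)
p.value <- (1 + sum(permuted >= observed)) / (R + 1)
is.significant <- p.value <= sig.lvl
```

Under this formula the p-value can never fall below 1 / (R + 1), so with R = 10 the smallest attainable value is about 0.09. A larger R gives a finer-grained p-value at the cost of more computation.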


eps
method-HMM: Convergence threshold for the Baum-Welch algorithm.


init
method-HMM: One of the following initialization procedures:


standard
The negative binomial of state '2-somy' will be initialized with mean=mean(counts), var=var(counts). This procedure usually gives good convergence.


random
Mean and variance of the negative binomial of state '2-somy' will be initialized with random values (in certain boundaries, see source code). Try this if the standard procedure fails to produce a good fit.
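As an illustration of what initializing a negative binomial from mean(counts) and var(counts) can look like, the following base-R sketch converts the two moments into the size and prob parameters used by R's dnbinom. The simulated counts are made up, and whether the package performs exactly this conversion internally is an assumption.

```r
set.seed(1)

# Simulated, overdispersed read counts (illustrative data only)
counts <- rnbinom(5000, size = 4, mu = 30)

m <- mean(counts)
v <- var(counts)

# Moment matching requires overdispersion (variance larger than the mean)
stopifnot(v > m)

# Negative binomial in R's (size, prob) parametrization:
#   mean = size * (1 - prob) / prob
#   var  = size * (1 - prob) / prob^2
size <- m^2 / (v - m)
prob <- m / v

# The fitted distribution reproduces the empirical moments
nb.mean <- size * (1 - prob) / prob
nb.var  <- size * (1 - prob) / prob^2
```

If the variance is not larger than the mean, this conversion yields no valid parameters, which is one plausible reason why a moment-based start can fail on some data.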


max.time
method-HMM: The maximum running time in seconds for the Baum-Welch algorithm. If this time is reached, the Baum-Welch will terminate after the current iteration finishes. Set max.time = -1 for no limit.


max.iter
method-HMM: The maximum number of iterations for the Baum-Welch algorithm. Set max.iter = -1 for no limit.


num.trials
method-HMM: The number of trials to find a fit where state most.frequent.state is most frequent. Each time, the HMM is seeded with different random initial values.


eps.try
method-HMM: If num.trials is greater than 1, eps.try is used for the trial runs. If unset, eps is used.


num.threads
method-HMM: Number of threads to use. Setting this to >1 may give increased performance.


count.cutoff.quantile
method-HMM: A quantile between 0 and 1. Should be near 1. Read counts above this quantile will be set to the read count specified by this quantile. Filtering very high read counts increases the performance of the Baum-Welch fitting procedure. However, if your data contains very few peaks they might be filtered out. Set count.cutoff.quantile=1 in this case.
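The capping described for count.cutoff.quantile can be sketched in base R as follows. The data are simulated and the exact rounding of the cutoff is an assumption; the sketch only shows the idea of setting counts above the quantile to the quantile's value.

```r
set.seed(1)

# Simulated bin counts with two artificially high outlier bins
counts <- c(rnbinom(1000, size = 5, mu = 20), 500L, 800L)

count.cutoff.quantile <- 0.999

# Read count at the requested quantile (truncated to an integer here)
cutoff <- as.integer(quantile(counts, count.cutoff.quantile))

# Counts above the cutoff are set to the cutoff; all others are unchanged
capped <- pmin(counts, cutoff)
```

With count.cutoff.quantile = 1 the cutoff equals the maximum count, so no bin is altered.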


strand
Find copy-numbers only for the specified strand. One of c('+', '-', '*').


states
method-HMM: A subset or all of c("zero-inflation","0-somy","1-somy","2-somy","3-somy","4-somy",...). This vector defines the states that are used in the Hidden Markov Model. The order of the entries must not be changed.
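For reference, the default states expression from the usage above can be evaluated in plain R, and a subset can be taken in a way that keeps the original order. The variable names all.states, wanted and subset.states are illustrative only.

```r
# Default state vector as written in the function signature
all.states <- c("zero-inflation", paste0(0:10, "-somy"))

# States we want to keep, deliberately listed out of order
wanted <- c("2-somy", "zero-inflation", "1-somy", "0-somy")

# Logical indexing on all.states preserves the required order
subset.states <- all.states[all.states %in% wanted]
```

Indexing the full vector, rather than typing the subset by hand, avoids accidentally reordering the entries.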


most.frequent.state
method-HMM: One of the states that were given in states. The specified state is assumed to be the most frequent one. This can help the fitting procedure to converge into the correct fit.


method
Any combination of c('HMM','dnacopy','edivisive'). Option method='HMM' uses a Hidden Markov Model as described in doi:10.1186/s13059-016-0971-7 to call copy numbers. Option 'dnacopy' uses segment from the DNAcopy package to call copy numbers similarly to the method proposed in doi:10.1038/nmeth.3578, which gives more robust but less sensitive results compared to the HMM. Option 'edivisive' (DEFAULT) works like option 'dnacopy' but uses the e.divisive function from the ecp package for segmentation.


algorithm
method-HMM: One of c('baumWelch','EM'). The expectation maximization ('EM') will find the most likely states and fit the best parameters to the data; 'baumWelch' will find the most likely states using the initial parameters.


initial.params
method-HMM: An aneuHMM object or file containing such an object from which initial starting parameters will be extracted.


Returns an aneuBiHMM object.


Author: Aaron Taudt


## Get an example BED file with single-cell-sequencing reads
bedfile <- system.file("extdata", "KK150311_VI_07.bam.bed.gz", package="AneuFinderData")
## Bin the reads into bins of size 1 Mb
binned <- binReads(bedfile, assembly='mm10', binsize=1e6,
                  chromosomes=c(1:19,'X','Y'), pairedEndReads=TRUE)
## Find copy-numbers
model <- findCNVs.strandseq(binned[[1]])
## Check the fit
plot(model, type='histogram')
plot(model, type='profile')
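The call above relies on the defaults. As a hedged sketch of how the HMM-related arguments might be set explicitly, the block below collects them in a list and only runs the fit if the AneuFinder package is installed and a binned object from the example above exists. The particular values (five somy states, three trials, a 60-second limit) are arbitrary illustrations, not recommendations.

```r
# HMM-specific settings; the values are arbitrary and for illustration only
hmm.args <- list(
  method = "HMM",
  states = c("zero-inflation", paste0(0:4, "-somy")),
  most.frequent.state = "1-somy",   # must be one of the entries in states
  num.trials = 3,
  max.time = 60                     # seconds per Baum-Welch run
)

# Run only when the package and the binned data from the example are available
if (requireNamespace("AneuFinder", quietly = TRUE) && exists("binned")) {
  model.hmm <- do.call(AneuFinder::findCNVs.strandseq,
                       c(list(binned[[1]]), hmm.args))
}
```

Keeping the settings in a list makes it easy to reuse the same configuration across several cells.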

ataudt/aneufinder documentation built on April 18, 2023, 4:20 a.m.