findCNVs: Find copy number variations

Description

findCNVs classifies binned read counts into discrete states, each of which represents a copy number.

Usage

findCNVs(binned.data, ID = NULL, method = "edivisive", strand = "*",
  R = 10, sig.lvl = 0.1, eps = 0.01, init = "standard", max.time = -1,
  max.iter = 1000, num.trials = 15, eps.try = max(10 * eps, 1),
  num.threads = 1, count.cutoff.quantile = 0.999,
  states = c("zero-inflation", paste0(0:10, "-somy")),
  most.frequent.state = "2-somy", algorithm = "EM", initial.params = NULL,
  verbosity = 1)

Arguments

binned.data

A GRanges-class object with binned read counts.

ID

An identifier used to refer to this sample in various downstream functions, for example the file name of the binned.data.

method

Any combination of c('HMM','dnacopy','edivisive'). Option method='HMM' uses a Hidden Markov Model as described in doi:10.1186/s13059-016-0971-7 to call copy numbers. Option 'dnacopy' uses segment from the DNAcopy package to call copy numbers similarly to the method proposed in doi:10.1038/nmeth.3578, which gives more robust but less sensitive results compared to the HMM. Option 'edivisive' (DEFAULT) works like option 'dnacopy' but uses the e.divisive function from the ecp package for segmentation.

strand

Find copy-numbers only for the specified strand. One of c('+', '-', '*').

R

method-edivisive: The maximum number of random permutations to use in each iteration of the permutation test (see e.divisive). Increase this value to improve accuracy at the cost of speed.

sig.lvl

method-edivisive: The level at which to sequentially test if a proposed change point is statistically significant (see e.divisive). Increase this value to find more breakpoints.

eps

method-HMM: Convergence threshold for the Baum-Welch algorithm.

init

method-HMM: One of the following initialization procedures:

standard

The negative binomial of state '2-somy' will be initialized with mean=mean(counts), var=var(counts). This procedure usually gives good convergence.

random

Mean and variance of the negative binomial of state '2-somy' will be initialized with random values (within certain bounds; see source code). Try this if the standard procedure fails to produce a good fit.
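
The 'standard' initialization above is stated in terms of mean and variance, while R parameterizes the negative binomial by size and prob. The following is a minimal sketch of the usual moment-matching conversion, not the package's internal code. The simulated counts and all variable names are assumptions made for illustration, and the conversion is only defined when the variance exceeds the mean.

```r
## Illustration only: simulated counts stand in for binned read counts
set.seed(1)
counts <- rnbinom(1000, size = 10, mu = 50)

## Moments used by the 'standard' initialization
m <- mean(counts)
v <- var(counts)

## Moment matching for the negative binomial (requires v > m):
##   mean = size * (1 - prob) / prob
##   var  = size * (1 - prob) / prob^2
prob <- m / v
size <- m^2 / (v - m)

## Density of the initialized distribution at the observed counts
init.density <- dnbinom(counts, size = size, prob = prob)
```

If var(counts) is not larger than mean(counts), the data are not overdispersed and this conversion does not yield valid parameters.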

max.time

method-HMM: The maximum running time in seconds for the Baum-Welch algorithm. If this time is reached, the algorithm will terminate after the current iteration finishes. Set max.time = -1 for no limit.

max.iter

method-HMM: The maximum number of iterations for the Baum-Welch algorithm. Set max.iter = -1 for no limit.

num.trials

method-HMM: The number of trials to find a fit where state most.frequent.state is most frequent. Each time, the HMM is seeded with different random initial values.

eps.try

method-HMM: If num.trials is set to greater than 1, eps.try is used for the trial runs. If unset, eps is used.
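
In the Usage signature, the default of eps.try is the expression max(10 * eps, 1), which R evaluates lazily against whatever eps is passed. The sketch below uses a hypothetical helper, showDefaults, that only mirrors these two arguments; it is not part of the package.

```r
## Hypothetical helper that copies only the eps / eps.try defaults from Usage
showDefaults <- function(eps = 0.01, eps.try = max(10 * eps, 1)) {
  eps.try
}

showDefaults()                          # 1, because max(0.1, 1) is 1
showDefaults(eps = 0.5)                 # 5, because max(5, 1) is 5
showDefaults(eps = 0.5, eps.try = 0.2)  # 0.2, an explicit value wins
```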

num.threads

method-HMM: Number of threads to use. Setting this to >1 may give increased performance.

count.cutoff.quantile

method-HMM: A quantile between 0 and 1, which should be near 1. Read counts above this quantile will be set to the read count specified by this quantile. Filtering very high read counts increases the performance of the Baum-Welch fitting procedure. However, if your data contains very few peaks, they might be filtered out. Set count.cutoff.quantile=1 in this case.
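
The capping described above can be sketched in base R as follows. This is an illustration under assumptions, not the package's implementation: the counts are simulated, the variable names are made up, and the package may round or otherwise treat the cutoff differently.

```r
## Illustration only: Poisson counts plus one extreme outlier bin
set.seed(1)
counts <- c(rpois(999, lambda = 20), 5000)

## Read count at the chosen quantile
count.cutoff.quantile <- 0.999
cutoff <- quantile(counts, count.cutoff.quantile, names = FALSE)

## Counts above the cutoff are set to the cutoff; all others are unchanged
capped <- pmin(counts, cutoff)
```

With count.cutoff.quantile = 1 the cutoff equals max(counts), so no value is changed.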

states

method-HMM: A subset or all of c("zero-inflation","0-somy","1-somy","2-somy","3-somy","4-somy",...). This vector defines the states that are used in the Hidden Markov Model. The order of the entries must not be changed.
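
Because the order of the entries must not be changed, one way to build a subset is to filter the full vector rather than typing the subset by hand. A minimal sketch, where wanted and subset.states are hypothetical names used only for illustration:

```r
## Default state vector, as given in Usage
states <- c("zero-inflation", paste0(0:10, "-somy"))

## States to keep, listed here in arbitrary order
wanted <- c("2-somy", "zero-inflation", "1-somy", "3-somy")

## Filtering the full vector preserves the original order
subset.states <- states[states %in% wanted]
subset.states
## [1] "zero-inflation" "1-somy"         "2-somy"         "3-somy"
```

The result could then be supplied as the states argument, with most.frequent.state set to one of its entries.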

most.frequent.state

method-HMM: One of the states that were given in states. The specified state is assumed to be the most frequent one. This can help the fitting procedure to converge into the correct fit.

algorithm

method-HMM: One of c('baumWelch','EM'). The expectation maximization ('EM') will find the most likely states and fit the best parameters to the data. The 'baumWelch' option will find the most likely states using the initial parameters.

initial.params

method-HMM: An aneuHMM object or file containing such an object from which initial starting parameters will be extracted.

verbosity

method-HMM: Integer specifying the verbosity of printed messages.

Value

An aneuHMM object.

Author(s)

Aaron Taudt

Examples

## Get an example BED file with single-cell-sequencing reads
bedfile <- system.file("extdata", "KK150311_VI_07.bam.bed.gz", package="AneuFinderData")
## Bin the data into 1 Mb bins
binned <- binReads(bedfile, assembly='mm10', binsize=1e6,
                  chromosomes=c(1:19,'X','Y'))
## Find copy-numbers
model <- findCNVs(binned[[1]])
## Check the fit
plot(model, type='histogram')


ataudt/aneufinder documentation built on April 18, 2023, 4:20 a.m.