get.mser.interpolation: Interpolate MSER dependency on the tag count
In spp: ChIP-Seq Processing Pipeline


MSER generally decreases with increasing sequencing depth. This function interpolates the dependency of MSER on tag counts as a log-log linear function. The log-log fit is then used to estimate the sequencing depth required to reach the desired target.fold.enrichment.
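As a rough illustration of the log-log interpolation described above, the sketch below fits a straight line to log10(MSER) against log10(tag count) and inverts it to find the depth at a target value. The `depth` and `mser` vectors are made-up numbers, and `predict_depth` is a hypothetical helper that is not part of spp; the package's own fit may be parameterized differently.

```r
# Hypothetical data: MSER values observed at increasing tag counts.
# Generated from an exact power law so the fit is easy to check by hand.
depth <- c(1e6, 2e6, 4e6, 8e6)
mser  <- 20 * (depth / 1e6)^(-0.5)

# Linear fit in log-log space: log10(mser) = a + b * log10(depth)
fit <- lm(log10(mser) ~ log10(depth))

# Hypothetical helper (not an spp function): invert the fitted line
# to obtain the depth at which the line crosses the target MSER.
predict_depth <- function(fit, target) {
  co <- unname(coef(fit))
  10^((log10(target) - co[1]) / co[2])
}

predicted <- predict_depth(fit, target = 5)
```

With these synthetic values the fitted slope is -0.5, so the line reaches an MSER of 5 at about 1.6e7 tags. Real data will scatter around the line, and extrapolating far beyond the observed depths should be treated with caution.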

get.mser.interpolation(signal.data, 
  control.data, 
  target.fold.enrichment = 5, 
  n.chains = 10, 
  n.steps = 6, 
  step.size = 1e+05, 
  chains = NULL, 
  test.agreement = 0.99, 
  return.chains = F, 
  enrichment.background.scales = c(1), 
  excluded.steps = c(seq(2, n.steps - 2)), ...)

`signal.data`	signal chromosome tag vector list
`control.data`	control chromosome tag vector list
`target.fold.enrichment`	target MSER for which the depth should be estimated
`n.steps`	number of steps in each subset chain.
`step.size`	Either number of tags or fraction of the dataset size, see `step.size` parameter for `get.mser`.
`test.agreement`	Fraction of the detected peaks that should agree between the full and subsampled datasets. See `test.agreement` parameter for `get.mser`.
`n.chains`	number of random subset chains
`chains`	optional structure of pre-calculated chains (e.g. generated by an earlier call with `return.chains=T`).
`return.chains`	whether to return peak predictions calculated on random chains. These can be passed back using the `chains` argument to skip the subsampling/prediction steps and only recalculate the depth estimate for a different MSER.
`enrichment.background.scales`	see `enrichment.background.scales` parameter for `get.mser`
`excluded.steps`	Intermediate subsampling steps that should be excluded from the chains to speed up the calculation. By default, all intermediate steps except for the first two and the last two are skipped. Including more intermediate steps improves the interpolation at the expense of computation time.
`...`	additional parameters are passed to `get.mser`
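To make the default of `excluded.steps` concrete, the short snippet below simply evaluates the default expression from the usage for `n.steps = 6`. The variable `fewer.excluded` is a hypothetical alternative shown only for illustration; how the step indices map onto the subsamples is determined by `get.mser` and is not shown here.

```r
n.steps <- 6

# Default expression from the usage: indices of the steps that are skipped
excluded.steps <- c(seq(2, n.steps - 2))
print(excluded.steps)   # 2 3 4

# Hypothetical alternative: skip fewer steps, so more points enter the fit
# (each extra step costs another round of peak prediction per chain)
fewer.excluded <- c(3)
```

Because the default is written in terms of `n.steps`, it changes whenever `n.steps` does. Note that in R `seq(2, 1)` counts downward and returns `2 1`, so for very small `n.steps` it may be safer to pass `excluded.steps` explicitly.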

To simulate sequencing growth, the method calculates peak predictions on random chains. Each chain is produced by sequential random subsampling of the original data. The number of steps in the chain indicates how many times the random subsampling will be performed.
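The idea of a chain can be sketched in a few lines of R. This is a simplified illustration, not the spp implementation: `make_chain` is a hypothetical helper, the tags are a plain integer vector rather than a chromosome tag vector list, and `step.size` is assumed to be an absolute number of tags removed at each step.

```r
# Hypothetical helper: build one chain by repeatedly dropping `step.size`
# randomly chosen tags from the previous subset, so that every subset
# is nested inside the one before it.
make_chain <- function(tags, n.steps, step.size) {
  chain <- vector("list", n.steps)
  current <- tags
  for (i in seq_len(n.steps)) {
    keep <- sample.int(length(current), length(current) - step.size)
    current <- current[sort(keep)]   # keep the original tag order
    chain[[i]] <- current
  }
  chain
}

set.seed(1)
tags <- seq_len(1000)                  # stand-in for tag positions
chain <- make_chain(tags, n.steps = 6, step.size = 100)
chain_sizes <- sapply(chain, length)   # 900 800 700 600 500 400
```

Running several such chains with different random seeds corresponds to the `n.chains` argument: each chain gives an independent view of how the peak predictions change as tags are removed.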

Normally returns a list, specifying for each background scale:

`prediction`	estimated sequencing depth required to reach specified target MSER
`log10.fit`	linear fit model, a result of `lm()` call

If `return.chains=T`, the above structure is returned under the `interpolation` field, along with a `chains` field containing the results of the `find.binding.positions` calls on the subsampled chains.
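A typical way to use the two return modes together might look like the sketch below. It assumes that the spp package is installed and that `signal.data` and `control.data` have already been prepared as chromosome tag vector lists; `estimate_depths` is a hypothetical wrapper written for this example, and the target values 5 and 2 are arbitrary.

```r
# Sketch only: the wrapper is hypothetical; the spp call uses the arguments
# documented above. Nothing runs until the function is called with real data.
estimate_depths <- function(signal.data, control.data) {
  # First call: perform subsampling and peak prediction, and keep the chains
  first <- spp::get.mser.interpolation(signal.data, control.data,
                                       target.fold.enrichment = 5,
                                       return.chains = TRUE)

  # Second call: reuse the stored chains, so only the depth estimate
  # is recalculated for the new target
  second <- spp::get.mser.interpolation(signal.data, control.data,
                                        target.fold.enrichment = 2,
                                        chains = first$chains)

  # `first` carries the estimates under $interpolation because
  # return.chains was set; `second` is the plain per-scale list
  list(target5 = first$interpolation, target2 = second)
}
```

Reusing the chains this way avoids repeating the subsampling and peak calling, which are the expensive part of the calculation.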

spp documentation built on May 30, 2019, 5:03 p.m.