problem.target: problem target

Description

Compute the target interval for a segmentation problem. This function repeatedly calls PeakSegFPOP_dir with different penalty values until it finds an interval of penalty values with minimal label error. The calls to PeakSegFPOP_dir are parallelized using psp_lapply. A time limit in minutes may be specified in the file problem.dir/target.minutes; once that many minutes have elapsed, the search stops and returns a sub-optimal target interval. This is useful in testing environments with build time limits (e.g. Travis).
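
The idea of an interval of penalty values with minimal label error can be illustrated with a toy sketch in base R. The penalties and error counts below are made up for illustration and are not output from PeakSegFPOP_dir; the variable names are hypothetical, and the package itself may define the interval endpoints differently.

```r
## Toy illustration only: hypothetical penalties and label error
## counts, NOT output from PeakSegFPOP_dir.
models <- data.frame(
  penalty = c(10, 100, 1000, 10000, 100000),
  errors  = c(3, 1, 0, 0, 2))

## Sort by penalty, then flag the models with the fewest label errors.
models <- models[order(models$penalty), ]
is.min <- models$errors == min(models$errors)

## Simplifying assumption: the minimal-error models are contiguous in
## penalty, so the range of their penalties gives one interval.
target <- log(range(models$penalty[is.min]))
print(target)
```

Here the two models with zero errors have penalties 1000 and 10000, so the toy interval is log(1000) to log(10000).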

Usage

problem.target(
  problem.dir,
  verbose = getOption("PeakSegPipeline.verbose", 1))

Arguments

problem.dir

problemID directory in which coverage.bedGraph has already been computed. If there is a labels.bed file then the number of incorrect labels will be computed in order to find the target interval of minimal error penalty values.

verbose

Value

A list of information about the target interval computation. target is the interval of log(penalty) values that achieve the minimum number of incorrect labels (numeric vector of length 2); models and iterations are data.tables with one row per model.
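
One way to use the target element is to pick a single log(penalty) value inside the interval. The sketch below uses a hypothetical helper, pick_log_penalty, which is not part of PeakSegPipeline. It also guards against infinite endpoints; that is an assumption, since this page does not say whether target can contain infinite values.

```r
## Hypothetical helper, not part of PeakSegPipeline: choose one
## log(penalty) value from a length-2 target interval.
pick_log_penalty <- function(target) {
  stopifnot(is.numeric(target), length(target) == 2)
  finite <- is.finite(target)
  ## Both endpoints finite: use the midpoint of the interval.
  if (all(finite)) return(mean(target))
  ## One endpoint infinite (assumed possible): use the finite one.
  if (any(finite)) return(target[finite])
  ## No finite endpoint: nothing sensible to return.
  NA_real_
}

## Made-up interval, for illustration only.
pick_log_penalty(c(8.5, 11.5))
```

With a real result this would be called as pick_log_penalty(target.list$target), and exp() of the returned value gives the corresponding penalty.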

Author(s)

Toby Dylan Hocking

Examples

library(PeakSegPipeline)
data(Mono27ac, envir=environment())
## Write the Mono27ac data set to disk.
problem.dir <- file.path(
  tempfile(),
  "H3K27ac-H3K4me3_TDHAM_BP",
  "samples",
  "Mono1_H3K27ac",
  "S001YW_NCMLS",
  "problems",
  "chr11-60000-580000")
dir.create(problem.dir, recursive=TRUE, showWarnings=FALSE)
write.table(
  Mono27ac$labels, file.path(problem.dir, "labels.bed"),
  col.names=FALSE, row.names=FALSE, quote=FALSE, sep="\t")
write.table(
  Mono27ac$coverage, file.path(problem.dir, "coverage.bedGraph"),
  col.names=FALSE, row.names=FALSE, quote=FALSE, sep="\t")

## Creating a target.minutes file stops the optimization after that
## number of minutes, resulting in an imprecise target interval, but
## saving time (to avoid NOTE on CRAN).
write.table(
  data.frame(minutes=0.05), file.path(problem.dir, "target.minutes"),
  col.names=FALSE, row.names=FALSE, quote=FALSE)

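## Declare a future plan for parallel computation.
if(requireNamespace("future") && interactive()){
  ## "multiprocess" is deprecated in recent versions of future;
  ## "multisession" is the portable replacement.
  future::plan("multisession")
}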

## Compute target interval.
target.list <- problem.target(problem.dir, verbose=1)

## These are all the models computed in order to find the target
## interval.
print(target.list$models[order(penalty), list(
  penalty, log.penalty=log(penalty), peaks, total.loss, fn, fp)])

## This is the target interval in log(penalty) values.
print(target.list$target)

tdhock/PeakSegPipeline documentation built on March 3, 2020, 1:35 a.m.