preprocess: Pre-process bam files following Broad's best practices for...

Description

This function provides a wrapper around the best practices described on GATK's website. If the link is broken, search for 'GATK best practices'.

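This aims to perform the following steps (for DNA):

- mark duplicates
- create realignment targets and realign around indels
- recalibrate base qualities and print the recalibrated reads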

For RNA, GATK recommends an additional split-and-trim step, which is not currently supported (contributions welcome!).

NOTE:

Some GATK tools use CPU threads while others use data threads; flowr tries to make the best use of either or both, depending on each tool's compatibility.
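
As a rough illustration of the note above, the sketch below picks a threading flag per GATK 3 tool. The mapping (`-nt` data threads for RealignerTargetCreator, `-nct` CPU threads for BaseRecalibrator and PrintReads, no threading for IndelRealigner) is an assumption based on GATK 3 conventions, not something read from the ngsflows source, and `thread_flag` is a hypothetical helper that is not part of flowr or ngsflows.

```r
# Hypothetical helper: choose a GATK 3 threading flag for a given tool.
# Assumed mapping (not taken from the ngsflows source):
#   -nt  = data threads, -nct = CPU threads per data thread.
thread_flag <- function(tool, cpu = 1) {
  flags <- c(
    RealignerTargetCreator = "-nt",
    IndelRealigner         = NA,     # assumed to run single-threaded
    BaseRecalibrator       = "-nct",
    PrintReads             = "-nct"
  )
  if (!tool %in% names(flags)) stop("unknown tool: ", tool)
  flag <- flags[[tool]]
  # No flag when the tool is single-threaded or only one CPU is requested
  if (is.na(flag) || cpu <= 1) return("")
  paste(flag, cpu)
}

thread_flag("RealignerTargetCreator", 8)  # data-thread flag
thread_flag("IndelRealigner", 8)          # empty string, no threading
```

Returning an empty string for single-threaded tools keeps the result safe to paste into a command line unconditionally.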

Usage

preprocess(x, outfile, samplename = opts_flow$get("samplename"),
  split_by_chr = opts_flow$get("split_by_chr"),
  java_exe = opts_flow$get("java_exe"),
  java_tmp = opts_flow$get("java_tmp"),
  gatk_jar = opts_flow$get("gatk_jar"),
  picard_dir = opts_flow$get("picard_dir"),
  samtools_exe = opts_flow$get("samtools_exe"), cpu_markdup = 1,
  mem_markdup = "-Xmx8g", cpu_target = opts_flow$get("cpu_target"),
  mem_target = "-Xmx32g", cpu_realign = opts_flow$get("cpu_realign"),
  mem_realign = "-Xmx4g",
  cpu_baserecalib = opts_flow$get("cpu_baserecalib"),
  mem_baserecalib = "-Xmx4g",
  cpu_printreads = opts_flow$get("cpu_printreads"),
  mem_printreads = "-Xmx4g", ref_fasta = opts_flow$get("ref_fasta"),
  gatk_target_opts = opts_flow$get("gatk_target_opts"),
  gatk_realign_opts = opts_flow$get("gatk_realign_opts"),
  gatk_baserecalib_opts = opts_flow$get("gatk_baserecalib_opts"),
  gatk_printreads_opts = opts_flow$get("gatk_printreads_opts"))

Arguments

java_exe

path to java

java_tmp

path to the Java temporary directory; can be left blank

cpu_markdup

not used.

cpu_target

number of threads used for GATK target creation step

q_obj

if provided, the output is a flow object; otherwise, a list of commands to run

java_mem_markdup

memory provided to Java for the mark-duplicates step, given as a JVM flag (for example "-Xmx8g")
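
To make the Java-related arguments concrete, here is a minimal sketch of how `java_exe`, a memory flag such as `mem_markdup`, and `java_tmp` could be combined into a command prefix. `java_prefix` is a hypothetical helper, and whether ngsflows passes `java_tmp` through `-Djava.io.tmpdir` is an assumption; the flag itself is a standard JVM system property.

```r
# Hypothetical helper: assemble the java prefix for one pipeline step.
java_prefix <- function(java_exe = "java", mem = "-Xmx8g", java_tmp = "") {
  # Assumption: the temp directory is passed via -Djava.io.tmpdir;
  # it is skipped entirely when java_tmp is left blank.
  tmp <- if (nzchar(java_tmp)) paste0("-Djava.io.tmpdir=", java_tmp) else NULL
  paste(c(java_exe, mem, tmp), collapse = " ")
}

java_prefix(mem = "-Xmx32g", java_tmp = "/scratch/tmp")
java_prefix()
```

The path `/scratch/tmp` is only a placeholder; a blank `java_tmp` leaves the JVM on its default temporary directory.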

Details

A flow following Broad's best practices for variant calling, starting from a sorted BAM file.

Examples

## Not run: 
## load options, including paths to tools and other parameters
load_opts(fetch_conf("ngsflows.conf"), check = FALSE)
out = preprocess("my_wex.bam", samplename = "samp", split_by_chr = TRUE)


## End(Not run)

flow-r/ngsflows documentation built on May 16, 2019, 1:25 p.m.