multisample_mutect2_gatk: Multiregion parallelization across Mutect2 Gatk Variant...

View source: R/gatk.R

multisample_mutect2_gatk R Documentation

Multiregion parallelization across Mutect2 Gatk Variant Calling for multiple samples

Description

This function calls Mutect2 across multiple regions in parallel. If a vector of tumour samples is provided, these are processed in multi-sample mode. To run in tumour-normal mode, supply a single tumour sample and a normal sample. If no normal is supplied, Mutect2 runs in tumour-only mode. TODO: Implement mitochondrial mode feature.

Usage

multisample_mutect2_gatk(
  sif_gatk = build_default_sif_list()$sif_gatk,
  bin_vep = build_default_tool_binary_list()$bin_vep,
  bin_bcftools = build_default_tool_binary_list()$bin_bcftools,
  bin_samtools = build_default_tool_binary_list()$bin_samtools,
  bin_bgzip = build_default_tool_binary_list()$bin_bgzip,
  bin_tabix = build_default_tool_binary_list()$bin_tabix,
  sample_sheet = NULL,
  bam_dir = "",
  normal_id = "",
  patient_id = "",
  pattern = "bam$",
  ref_genome = build_default_reference_list()$HG19$reference$genome,
  germ_resource = build_default_reference_list()$HG19$variant$germ_reference,
  biallelic_db = build_default_reference_list()$HG19$variant$biallelic_reference,
  db_interval = build_default_reference_list()$HG19$variant$biallelic_reference,
  pon = build_default_reference_list()$HG19$panel$PCF_V3$variant$pon_muts,
  chr = c(1:22, "X", "Y", "MT"),
  method = "single",
  regions = NULL,
  output_dir = ".",
  verbose = FALSE,
  filter = TRUE,
  annotate = TRUE,
  extract_pass = TRUE,
  orientation = TRUE,
  mnps = FALSE,
  contamination = TRUE,
  clean = FALSE,
  header = TRUE,
  sep = "\t",
  batch_config = build_default_preprocess_config(),
  threads = 4,
  ram = 8,
  mode = "local",
  executor_id = make_unique_id("multiSampleMutect2"),
  task_name = "multiSampleMutect2",
  time = "48:0:0",
  update_time = 60,
  wait = FALSE,
  hold = NULL
)
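
As a rough illustration of the signature above, the following sketch prepares a tumour-normal call in local mode. All paths and sample identifiers are hypothetical placeholders. It assumes that the function is exported from the ULPwgs package, that BAM files are discovered in bam_dir using pattern, and that normal_id identifies the normal sample; none of these details is confirmed on this page.

```r
# Hypothetical inputs: placeholder paths and IDs, not values shipped with the package
args <- list(
  bam_dir    = "data/bams",        # directory searched for files matching `pattern`
  normal_id  = "PATIENT1_normal",  # assumed to identify the normal sample
  patient_id = "PATIENT1",
  output_dir = "results/mutect2",
  mode       = "local",            # documented options: "local", "batch"
  threads    = 4,
  ram        = 8
)

# Run only when the package is installed and the input directory exists;
# otherwise just report that the arguments were prepared.
if (requireNamespace("ULPwgs", quietly = TRUE) && dir.exists(args$bam_dir)) {
  do.call(ULPwgs::multisample_mutect2_gatk, args)
} else {
  message("ULPwgs or input data not available; arguments prepared but not run.")
}
```

Arguments left out of the list fall back to the defaults shown in the usage block, including the HG19 reference files.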

Arguments

sif_gatk

REQUIRED Path to gatk sif file.

bin_vep

REQUIRED Path to VEP binary.

bin_bcftools

REQUIRED Path to bcftools binary file.

bin_samtools

REQUIRED Path to samtools binary file.

bin_bgzip

REQUIRED Path to bgzip binary file.

bin_tabix

REQUIRED Path to tabix binary file.

sample_sheet

OPTIONAL Path to sheet with sample information.

bam_dir

OPTIONAL Path to directory with BAM files.

normal_id

OPTIONAL Identifier of the normal sample. If not supplied, Mutect2 runs in tumour-only mode.

ref_genome

REQUIRED Path to reference genome fasta file.

germ_resource

REQUIRED Path to germline resources VCF file.

pon

OPTIONAL Path to panel of normals.

chr

OPTIONAL Chromosomes to analyze. Default c(1:22,"X","Y","MT")

method

OPTIONAL Variant calling method. Default "single". Options "single", "multi"

regions

OPTIONAL Regions to analyze. If regions for parallelization are not provided, they will be inferred from the BAM file.

output_dir

OPTIONAL Path to the output directory.

verbose

OPTIONAL Enables progress messages. Default FALSE.

filter

OPTIONAL Filter Mutect2 variant calls. Default TRUE.

orientation

OPTIONAL Produce read orientation information. Default TRUE.

mnps

OPTIONAL Report MNPs in the VCF file. Default FALSE.

contamination

OPTIONAL Produce sample cross-contamination reports. Default TRUE.

threads

OPTIONAL Number of threads to split the work. Default 4

ram

OPTIONAL RAM to assign to each thread. Default 8

mode

REQUIRED Where to parallelize. Default local. Options "local","batch"

executor_id

Task EXECUTOR ID. Default make_unique_id("multiSampleMutect2")

task_name

Task name. Default "multiSampleMutect2"

time

OPTIONAL In batch mode, maximum run time per job. Default "48:0:0"

update_time

OPTIONAL In batch mode, job update time in seconds. Default 60.

wait

OPTIONAL In batch mode, wait for the batch to finish. Default FALSE

hold

OPTIONAL Job ID. Hold this job until the job with the given ID has finished.

output_name

OPTIONAL Name for the output. If not given the name of one of the samples will be used.
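
To make the chr and batch-related arguments more concrete, here is a minimal sketch. It first shows that the default chr value is a character vector, because c() coerces the mixed integer and string input, and then prepares a batch-mode call restricted to the autosomes. The paths, the patient ID, the shorter time limit and the choice of autosomes are illustrative assumptions, as is the function being exported from the ULPwgs package.

```r
# Default chromosome set: c() coerces integers and strings to one character vector
chr_default <- c(1:22, "X", "Y", "MT")
is.character(chr_default)  # TRUE
length(chr_default)        # 25

# Illustrative restriction to the autosomes only
chr_autosomes <- as.character(1:22)

# Hypothetical batch submission: placeholder paths and IDs
batch_args <- list(
  bam_dir     = "data/bams",
  patient_id  = "PATIENT1",
  output_dir  = "results/mutect2",
  chr         = chr_autosomes,
  mode        = "batch",    # documented options: "local", "batch"
  time        = "24:0:0",   # same format as the documented default "48:0:0"
  update_time = 60,         # job update time in seconds
  wait        = TRUE        # wait for the batch to finish
)

# Submit only when the package is installed and the input directory exists
if (requireNamespace("ULPwgs", quietly = TRUE) && dir.exists(batch_args$bam_dir)) {
  do.call(ULPwgs::multisample_mutect2_gatk, batch_args)
} else {
  message("ULPwgs or input data not available; arguments prepared but not run.")
}
```

No normal_id is set here, so according to the description this call would run in tumour-only mode.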

Details

For more information read: https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2


TearsWillFall/ULPwgs documentation built on April 18, 2024, 3:45 p.m.