metrics_alignqc: Generate quality control metrics for aligned sequences

View source: R/analysis.R

metrics_alignqc R Documentation

Generate quality control metrics for aligned sequences

Description

This function generates quality control metrics for an aligned sequence file (BAM). It works for WGS and panel data. WGS metrics are generated if bait and target intervals are not given; otherwise panel metrics are generated. Target and bait BEDs can be converted to interval format using Picard's BedToIntervalList tool. For example:

java -jar ~/tools/picard/build/libs/picard.jar BedToIntervalList I=PCFv2newchem_primary_targets.bed O=PCFv2newchem_primary_targets.interval_list SD=~/Scratch/RefGenome/hs37d5.fa

For more information about the interval format, see: https://gatk.broadinstitute.org/hc/en-us/articles/360036883931-BedToIntervalList-Picard-
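For orientation, the main difference between the two formats is the coordinate convention: BED starts are 0-based with an exclusive end, whereas Picard interval lists are 1-based with an inclusive end and carry a SAM-style sequence dictionary header. The shell sketch below illustrates only the coordinate shift on the body lines. It does not write the header and is not a substitute for BedToIntervalList. The function name bed_to_interval_body, the sample coordinates, the fixed "+" strand and the "." placeholder name are assumptions made for this illustration.

```shell
#!/bin/sh
# Illustration only: convert BED body lines to interval-list body lines.
# BED:           chrom, start (0-based), end (exclusive), optional name in column 4.
# Interval list: chrom, start (1-based), end (inclusive), strand, name.
# The required @HD/@SQ header is NOT produced here; use Picard for real files.
bed_to_interval_body() {
  awk 'BEGIN { FS = OFS = "\t" }
       # Skip header-like lines that some BED files carry.
       /^(#|track|browser)/ { next }
       {
         # Simplification of this sketch: strand is always "+",
         # and a missing name becomes ".".
         name = (NF >= 4 && $4 != "") ? $4 : "."
         print $1, $2 + 1, $3, "+", name
       }' "$1"
}

# Hypothetical two-region BED file for demonstration.
demo_bed=$(mktemp)
printf '1\t99\t200\ttargetA\n2\t0\t50\n' > "$demo_bed"
demo_out=$(bed_to_interval_body "$demo_bed")
printf '%s\n' "$demo_out"
rm -f "$demo_bed"
```

Only the start coordinate changes: a 0-based exclusive end and a 1-based inclusive end are the same number.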

Usage

metrics_alignqc(
  bin_samtools = build_default_tool_binary_list()$bin_samtools,
  bin_picard = build_default_tool_binary_list()$bin_picard,
  bin_bedtools = build_default_tool_binary_list()$bin_bedtools,
  ref_genome = build_default_reference_list()$HG19$reference$genome,
  bam = "",
  output_dir = ".",
  verbose = FALSE,
  batch_config = build_default_preprocess_config(),
  tmp_dir = ".",
  mapq = 0,
  bi = build_default_reference_list()$HG19$panel$PCF_V3$intervals$bi,
  ti = build_default_reference_list()$HG19$panel$PCF_V3$intervals$ti,
  ri = build_default_reference_list()$HG19$rnaseq$intervals$ri,
  ref_flat = build_default_reference_list()$HG19$rnaseq$reference$ref_flat,
  method = "tg",
  mode = "local",
  executor_id = make_unique_id("alignQC"),
  task_name = "alignQC",
  time = "48:0:0",
  threads = 3,
  ram = 4,
  update_time = 60,
  wait = FALSE,
  hold = NULL
)

Arguments

bin_samtools

Path to samtools executable. Default path tools/samtools/samtools.

bin_picard

Path to picard executable. Default path tools/picard/build/libs/picard.jar.

bin_bedtools

Path to bedtools executable. Default path tools/bedtools2/bin/bedtools. Only required if analyzing panel data.

ref_genome

Path to input file with the reference genome sequence.

bam

Path to the BAM file.

output_dir

Path to the output directory.

verbose

Enables progress messages. Default FALSE.

tmp_dir

Path to tmp directory.

mapq

Minimum MAPQ for Picard WGS metrics. Default 0.

bi

Bait capture target intervals for panel data, in interval format. Requires the ti, off_tar and on_tar arguments.

ti

Primary target intervals for panel data, in interval format. Requires the bi, off_tar and on_tar arguments.

ri

Path to ribosomal intervals file. Only for RNAseq.

ref_flat

Path to flat reference file. Only for RNAseq.

method

Type of data to generate metrics for. Options "wgs", "tg", "rna". Default "tg".

executor_id

Task executor ID. Default make_unique_id("alignQC").

task_name

Task name. Default "alignQC".

ram

RAM in GB to use. Default 4 GB.

threads

Threads to use. Default 3.

Details

An off-target BED can be created by generating the regions complementary to the target BED with the bedtools complement function. For example:

~/tools/bedtools2/bin/bedtools complement -i PCFv2newchem_capture_targets.bed -g ~/Scratch/RefGenome/hs37d5.fa.fai > Probes_Off_target_regions.bed

Note: This BED has to be sorted using the same reference as the BAM files. Generating the BED with a reference different from the one used to align the BAM may cause issues, even when the BED is sorted, because of scaffold chromosomes.
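One way to catch a reference mismatch of this kind before running bedtools is to compare the contig names in the BED against the FASTA index. The sketch below uses only awk. The function name missing_contigs and the demo files are hypothetical, and the check covers contig names only, not sort order.

```shell
#!/bin/sh
# Print every contig name used in a BED file that is absent from a .fai index.
# Usage: missing_contigs targets.bed reference.fa.fai
missing_contigs() {
  awk 'BEGIN { FS = "\t" }
       # First file (.fai): remember each contig name (column 1).
       NR == FNR { known[$1] = 1; next }
       # Second file (BED): skip header-like lines.
       /^(#|track|browser)/ { next }
       # Report each unknown contig once.
       !($1 in known) && !seen[$1]++ { print $1 }' "$2" "$1"
}

# Hypothetical demo: the index names contigs "1" and "2",
# while the BED mixes "1" with "chr2".
demo_dir=$(mktemp -d)
printf '1\t249250621\t52\t60\t61\n2\t243199373\t253404903\t60\t61\n' > "$demo_dir/ref.fa.fai"
printf '1\t99\t200\nchr2\t0\t50\nchr2\t60\t90\n' > "$demo_dir/targets.bed"
missing=$(missing_contigs "$demo_dir/targets.bed" "$demo_dir/ref.fa.fai")
printf 'Contigs missing from the index: %s\n' "${missing:-none}"
rm -rf "$demo_dir"
```

An empty result means every contig in the BED is known to the index. Any name that is printed points to a BED built against a different reference or naming scheme.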


TearsWillFall/ULPwgs documentation built on April 18, 2024, 3:45 p.m.