somaticWgsAnalysis: somaticWgsAnalysis


View source: R/somaticWgsAnalysis.R

Description

The somaticWgsAnalysis pipeline identifies somatic variants in whole genome sequencing (WGS) data. The pipeline starts with a reference alignment step, followed by co-cleaning to increase the alignment quality. Six different variant calling pipelines are then run separately to identify somatic mutations.

The variants identified by the somatic callers are then annotated. Finally, the annotated VCF files are converted into a MAF file.

Usage


Arguments

tumor_file

Tumor BAM file on which to perform the variant calling.

normal_file

Normal BAM file on which to perform the variant calling.

threads

Number of threads to use in the analysis.

ref

Path to the reference genome used for the alignment (FASTA format). The corresponding indexes generated with bwa index and a sequence dictionary file generated by the GATK CreateSequenceDictionary tool must also be available.
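The check below is a minimal sketch of how the files expected alongside 'ref' could be verified before starting a run. The helper missing_ref_files is hypothetical and not part of ergWgsTools. It assumes that bwa index wrote its five index files next to the FASTA using the FASTA name as prefix, and that CreateSequenceDictionary wrote its output with the FASTA extension replaced by .dict, which are the usual defaults of those tools.

```r
# Hypothetical helper, not part of ergWgsTools: list the reference files
# that are expected next to `ref` but cannot be found on disk.
missing_ref_files <- function(ref) {
  # bwa index appends these suffixes to the full FASTA file name
  bwa_idx <- paste0(ref, c(".amb", ".ann", ".bwt", ".pac", ".sa"))
  # Assumption: the dictionary replaces the FASTA extension with .dict
  dict <- paste0(sub("\\.(fa|fasta|fna)(\\.gz)?$", "", ref), ".dict")
  expected <- c(ref, bwa_idx, dict)
  expected[!file.exists(expected)]
}

# Example with a placeholder path; an empty result means nothing is missing.
missing_ref_files("/path/to/genome.fa")
```

If the returned vector is not empty, the listed files would need to be generated before the alignment step can use the reference.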

out_path

Path where the output of the analysis will be saved.

muse

Path of MuSE binary.

gatk4

Path of GATK4 binary.

samtools_mpileup

Path of samtools mpileup binary.

af_only_gnomad

Genome Aggregation Database (gnomAD) file used as a germline resource. It has to be based on the same reference genome as 'ref'.

somatic_sniper

Path of SomaticSniper binary.

bwa

Path of bwa binary.

samblaster

Path of samblaster binary.

samtools

Path of samtools binary.

sambamba

Path of sambamba binary.

indel_candidates

For the somatic workflow, the best-practice recommendation is to run the Manta SV and indel caller on the same set of samples first, then supply Manta's candidate indels as input to Strelka2. The sample name has to be the same in both Strelka2 and Manta so that the indel candidates file is detected correctly.
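As a small illustration of this hand-off: Manta normally writes its candidate indels to results/variants/candidateSmallIndels.vcf.gz inside its run directory. The helper below is a hypothetical sketch, not part of ergWgsTools, that builds that path so it can be passed as 'indel_candidates'. The run directory name is an assumption.

```r
# Hypothetical helper, not part of ergWgsTools: build the path to the
# candidate indel file inside a Manta run directory.
manta_indel_candidates <- function(manta_run_dir) {
  file.path(manta_run_dir, "results", "variants",
            "candidateSmallIndels.vcf.gz")
}

# Placeholder run directory; check that the file exists before using it.
candidates <- manta_indel_candidates("analysis/manta_tumor_vs_normal")
file.exists(candidates)
```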

centromeres_telomeres

BED file with the centromeres and/or telomeres, based on the same reference genome as 'ref'.

varscan

Path of VarScan2 binary.

manta

Path of manta binary.

strelka2

Path of strelka2 binary.

pindel

Path of Pindel binary.

perl

Path of perl executable.

fastq

FASTQ file on which to carry out the analysis. For paired-end data, 'input_file' has to contain the mate 1 reads, and the two files of a pair should be named with "_R1" or "_R2". Allowed formats: fastq.gz, fq.gz, fastq, fq or bam.
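The naming convention for paired-end files can be illustrated with a short sketch. The helper mate2_path is hypothetical and not part of ergWgsTools; how the package itself locates the second mate is not documented here. It assumes that the mate 1 file name contains "_R1" and that the mate 2 file differs only in that tag.

```r
# Hypothetical helper, not part of ergWgsTools: derive the mate 2 file
# path from a mate 1 path that follows the "_R1" / "_R2" convention.
mate2_path <- function(mate1) {
  if (!grepl("_R1", basename(mate1), fixed = TRUE)) {
    stop("File name does not contain '_R1': ", mate1)
  }
  # Replace only the last "_R1", which lies in the file name itself
  sub("_R1(?!.*_R1)", "_R2", mate1, perl = TRUE)
}

mate2_path("sample01_R1.fastq.gz")  # "sample01_R2.fastq.gz"
```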

python_radia

Path to the python binary with all the RADIA prerequisites.

tumor_vcf_id

ID of the tumor sample in the VCF. Defaults to 'TUMOR'.

bam

BAM file on which to carry out the analysis.


msubirana/ergWgsTools documentation built on June 8, 2020, 8:07 a.m.