View source: R/somaticWgsAnalysis.R
The somaticWgsAnalysis pipeline identifies somatic variants in whole genome sequencing (WGS) data. The pipeline first aligns the reads to the reference genome and then applies co-cleaning to improve alignment quality. Six different variant-calling pipelines are then run separately to identify somatic mutations.
The variants identified by the somatic callers are then annotated. Finally, the annotated VCF files are converted into a MAF file.
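As a rough illustration of how the pipeline described above might be launched, the sketch below collects a few of the documented arguments in a list and checks that the input files exist before calling the function. All paths are hypothetical placeholders, and which arguments are required (and what their defaults are) is an assumption, since that is not stated here. The call itself is guarded so the snippet runs even when the package is not loaded.

```r
# All paths below are hypothetical placeholders; replace them with real files.
wgs_args <- list(
  tumor_file   = "tumor.bam",
  normal_file  = "normal.bam",
  ref          = "reference/genome.fa",
  out_path     = "results",
  threads      = 4,
  tumor_vcf_id = "TUMOR"
)

# Find file-like inputs that are absent on disk before starting a long run.
input_names    <- c("tumor_file", "normal_file", "ref")
missing_inputs <- input_names[!file.exists(unlist(wgs_args[input_names]))]

if (length(missing_inputs) > 0) {
  message("Not running; missing inputs: ", paste(missing_inputs, collapse = ", "))
} else if (!exists("somaticWgsAnalysis")) {
  message("Not running; somaticWgsAnalysis is not available in this session.")
} else {
  # Assumption: the function accepts these arguments by name.
  do.call(somaticWgsAnalysis, wgs_args)
}
```

Passing the arguments through `do.call` keeps the configuration in one place, which makes it easy to add the tool paths (for example `bwa`, `gatk4` or `muse`) as further list entries.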
tumor_file |
Tumor BAM file on which to perform the variant calling. |
normal_file |
Normal BAM file on which to perform the variant calling. |
threads |
Number of threads to use in the analysis. |
ref |
Path to the reference genome used for the alignment (FASTA format). The corresponding indexes generated with bwa index and a dictionary file generated by the GATK CreateSequenceDictionary tool are also required. |
out_path |
Path where the output of the analysis will be saved. |
muse |
Path of MuSE binary. |
gatk4 |
Path of GATK4 binary. |
samtools_mpileup |
Path of samtools mpileup binary. |
af_only_gnomad |
Genome Aggregation Database (gnomAD) file used as a germline resource. It must be based on the same reference genome as 'ref'. |
somatic_sniper |
Path of SomaticSniper binary. |
bwa |
Path of bwa binary. |
samblaster |
Path of samblaster binary. |
samtools |
Path of samtools binary. |
sambamba |
Path of sambamba binary. |
indel_candidates |
For the somatic workflow, the best-practice recommendation is to run the Manta SV and indel caller on the same set of samples first, then supply Manta's candidate indels as input to Strelka. The sample name must be the same in both Strelka2 and Manta so that the indel candidates file is detected correctly. |
centromeres_telomeres |
BED file with the centromeres and/or telomeres, based on the same reference genome as 'ref'. |
varscan |
Path of Varscan2 binary. |
manta |
Path of manta binary. |
strelka2 |
Path of strelka2 binary. |
pindel |
Path of Pindel binary. |
perl |
Path of perl executable. |
fastq |
FASTQ file on which to carry out the analysis. For paired-end data, this file must contain the mate 1 reads, and the two mates should be named "_R1" and "_R2". Allowed formats: fastq.gz, fq.gz, fastq, fq or bam. |
python_radia |
Path to the python binary with all the RADIA prerequisites. |
tumor_vcf_id |
ID of the tumor sample in the VCF. Defaults to 'TUMOR'. |
bam |
BAM file on which to carry out the analysis. |
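The 'ref' argument above expects the bwa indexes and a sequence dictionary to accompany the FASTA file. A minimal sketch of a pre-flight check follows. The helper name `missing_ref_indexes` is hypothetical and not part of the package, and the expected file names assume the usual defaults of `bwa index` (suffixes appended to the FASTA name) and of GATK CreateSequenceDictionary (a `.dict` file replacing the FASTA extension).

```r
# Hypothetical helper: lists the companion files expected next to `ref`
# that are not present on disk. The naming conventions are an assumption
# based on the tools' usual defaults, not on the package's own checks.
missing_ref_indexes <- function(ref) {
  bwa_idx  <- paste0(ref, c(".amb", ".ann", ".bwt", ".pac", ".sa"))
  dict     <- paste0(tools::file_path_sans_ext(ref), ".dict")
  expected <- c(bwa_idx, dict)
  expected[!file.exists(expected)]
}

# Usage with a throwaway reference path in the session's temporary directory:
ref <- file.path(tempdir(), "example_genome.fa")
file.create(paste0(ref, c(".amb", ".ann", ".bwt", ".pac", ".sa")))
basename(missing_ref_indexes(ref))
```

With only the bwa index files created, the helper reports the dictionary as the remaining missing file. An empty result means all expected companions were found.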