This package is under development.
The package is based on GATK4 guidelines and provides docker containerized tools and shiny application interface. Containerization through docker provides high level of reproducibility and easy deployment without the need of installation of each tool and its requirements. In addition, implemented deployment of multiple docker containers enables parallel processing for functions without multi-threading options (Mutect2).
1.Input
1.Config
2.Metadata
2.Shiny
Users must have R installed. Rstudio is optional but recommended IDE.
To install SNPsom package from GitHub, it is convenient to use the package devtools, namely function: install_github.
# install devtools
install.packages("devtools")
# install RNAseqCNV package
devtools::install_github(repo = "honzee/SNPsom")
The tools are deployed in separate docker containers. The images are automatically downloaded from dockerhub. Docker must be installed and run in order for the package to work. More information on docker installation here: https://docs.docker.com/get-docker/
Each SNPsom tool can be run separately or via function wrappers. trimmomatic_fastqc wrapper provides options for adapter trimming and subsequent checking by fastqc. SNPsom_wrapper deploys tool for somatic variant analysis. For more information on functions and their parameters, please refer to the function documentation.
The necessary for all of the functions are config and metadata files.
Config file defines the input directory (input_dir), output directory (out_dir), reference fasta file (reference), reference version (reference_version) and Funcotator formated annotation directory (annotation_dir). Change the file directories accordingly but keep the key words identical as the example below:
input_dir = "/Path/to/input_dir"
output_dir = "/Path/to/output_dir"
reference = "/Path/to/reference.fasta"
reference_version = "hg19orhg38"
annotation_dir = "/Path/to/funcotator_formatted_dir"
Metadata file contains information about the analyzed samples. The required columns are: sample - sample name; status - "t" or "c" for tumour of control sample, patient - patient id, fastq1 - file name for the forward fastq file, fastq2 file name for the reverse fastq file
| sample | patient | status | fastq1 | fastq2 | |--------------|---------|--------|-----------------------|-----------------------| | PE01_tumour | PE01 | t | PE01_tumour_R1.fastq | PE01_tumour_R2.fastq | | PE01_control | PE01 | c | PE01_control_R1.fastq | PE01_control_R2.fastq | | PE02_tumour | PE02 | t | PE02_tumour_1.fastq | PE02_tumour_2.fastq | | PE02_control | PE02 | c | PE02_control_R1.fastq | PE02_control_2.fastq |
SNPsom_wrapper allows deployment of alignment (bwa-mem2), marking duplicates (MarkDuplicatesSpark), variant calling (Mutect2), variant filtering (FilterMutectCalls) and variant annotation (Funcotator) in single command.
SNPsom_wrapper(metadata = metadata, input_dir = input_dir, output_dir = output_dir, reference = reference, t = t, parallel = parallel, reference_version = "hg19/hg38", annotation_dir = annotation_dir)
Shiny app allows for deployment of SNPsom_wrapper by providing the config and metadata files through interactive interface.
launchApp()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.