filterTarget: Excludes all off-target calls from further analysis.

View source: R/appreci8R_classical.R

Description

appreci8R combines and filters the output of different variant calling tools according to the 'appreci8' algorithm. In the first analysis step, all off-target calls are excluded from further analysis. A list of data.frames (one data.frame per sample) containing all on-target calls is returned.

Usage

filterTarget(output_folder, caller_name, caller_folder, caller_file_names_add,
             caller_file_type, caller_snv_indel, caller_snv_names_add,
             caller_indel_names_add, caller_chr = 1, caller_pos = 2,
             caller_ref = 4, caller_alt = 5, targetRegions)

Arguments

output_folder

The folder to write the output files into. If an empty string is provided, no files are written out.

caller_name

Name of the variant calling tool (only necessary if an output folder is provided).

caller_folder

Folder containing the variant calling results.

caller_file_names_add

Suffix for naming the variant calling files. If an empty string is provided, it is assumed that the files only contain the sample name, e.g. "Sample1.vcf".

caller_file_type

File type of the variant calling results (".vcf" or ".txt").

caller_snv_indel

Logical: TRUE if SNVs and indels are reported in the same file, FALSE if they are reported in separate files.

caller_snv_names_add

Suffix for naming the variant calling files containing SNVs (only evaluated if caller_snv_indel==FALSE).

caller_indel_names_add

Suffix for naming the variant calling files containing indels (only evaluated if caller_snv_indel==FALSE).

caller_chr

Column of the variant calling input containing information on chr (default: 1).

caller_pos

Column of the variant calling input containing information on pos (default: 2).

caller_ref

Column of the variant calling input containing information on ref (default: 4).

caller_alt

Column of the variant calling input containing information on alt (default: 5).

targetRegions

Data.frame object containing the target regions to be analyzed (bed-format: 1st column chr, 2nd column 0-based start pos, 3rd column 1-based end pos). Or: GRanges object containing the target regions to be analyzed.
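
The column arguments above can be illustrated with a small base R sketch. It assumes tab-separated input laid out like the fixed VCF columns (CHROM, POS, ID, REF, ALT, ...), which is why the defaults 1, 2, 4 and 5 skip the ID column. The two records are made up for illustration, and the code is not the package's own reader.

```r
# Two made-up, VCF-like records: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
vcf_text <- "2\t25469502\t.\tC\tT\t50\tPASS\t.\n17\t7579472\t.\tG\tC\t99\tPASS\t."
raw <- read.table(text = vcf_text, sep = "\t", colClasses = "character",
                  stringsAsFactors = FALSE)

# Column indices as in the defaults of filterTarget
caller_chr <- 1
caller_pos <- 2
caller_ref <- 4
caller_alt <- 5

# Pick the four columns by index and give them the documented names
calls <- data.frame(Chr = raw[[caller_chr]],
                    Pos = as.numeric(raw[[caller_pos]]),
                    Ref = raw[[caller_ref]],
                    Alt = raw[[caller_alt]],
                    stringsAsFactors = FALSE)
print(calls)
```

For a ".txt" file with a different layout, only the four index values would need to change.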

Details

The function filterTarget covers two steps: reading input and target filtration.

First, all files in caller_folder with the file type caller_file_type and the suffix caller_file_names_add are read. Sample names are derived automatically from the file names. For example, the file "Sample1.txt" with no suffix defined yields the sample name "Sample1". The file "Sample1.mutations.vcf" yields "Sample1.mutations" if no suffix is defined, but "Sample1" if the suffix ".mutations" is defined.
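
The naming rule described above can be sketched in base R. The helpers strip_end and derive_sample_name are hypothetical and not part of appreci8R; they only mimic the documented behaviour of removing the file type and then an optional suffix.

```r
# Hypothetical helper: remove `ending` from the end of `x` if it is present
strip_end <- function(x, ending) {
  if (nzchar(ending) && endsWith(x, ending)) {
    substr(x, 1, nchar(x) - nchar(ending))
  } else {
    x
  }
}

# Hypothetical helper: file name -> sample name (file type first, then suffix)
derive_sample_name <- function(file_name, file_type, suffix = "") {
  strip_end(strip_end(file_name, file_type), suffix)
}

derive_sample_name("Sample1.txt", ".txt")                          # "Sample1"
derive_sample_name("Sample1.mutations.vcf", ".vcf")                # "Sample1.mutations"
derive_sample_name("Sample1.mutations.vcf", ".vcf", ".mutations")  # "Sample1"
```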

If SNVs and indels are reported in separate files (in the same folder), set caller_snv_indel==FALSE and define caller_snv_names_add and caller_indel_names_add. Input from two files per sample is then read and automatically combined. For example, a sample is called "Sample1" if the files "Sample1.SNV.vcf" and "Sample1.indel.vcf" are read with caller_snv_indel==FALSE, caller_snv_names_add defined as ".SNV" and caller_indel_names_add defined as ".indel".
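
As a rough sketch of what combining two files per sample amounts to, the following base R code stacks a table of SNV calls and a table of indel calls for one sample. The calls are made up, and how appreci8R merges the two inputs internally is not documented here, so a plain rbind is an assumption.

```r
# Made-up calls as they might be read from "Sample1.SNV.vcf"
snv <- data.frame(Chr = c("2", "17"),
                  Pos = c(25469502, 7579472),
                  Ref = c("C", "G"),
                  Alt = c("T", "C"),
                  stringsAsFactors = FALSE)

# Made-up call as it might be read from "Sample1.indel.vcf"
indel <- data.frame(Chr = "4",
                    Pos = 106196951,
                    Ref = "A",
                    Alt = "AG",
                    stringsAsFactors = FALSE)

# Stack both tables and tag every row with the shared sample name
combined <- data.frame(SampleID = "Sample1",
                       rbind(snv, indel),
                       stringsAsFactors = FALSE)
print(combined)
```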

Subsequently, the read variant calling results are filtered according to the defined target region. All off-target calls are excluded from further analysis.
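
The target filtration can be pictured with a minimal base R sketch. It assumes bed-style regions (0-based start, 1-based end), so a 1-based position is on target when start < Pos <= end. The helper is_on_target is hypothetical, the calls are made up, and the package may treat edge cases such as indels spanning a region boundary differently.

```r
# Two of the target regions from the Examples section (bed-format)
target <- data.frame(chr = c("2", "4"),
                     start = c(25469500, 106196950),
                     end = c(25469510, 106196960),
                     stringsAsFactors = FALSE)

# Made-up calls: inside a region, exactly at a 0-based start, outside
calls <- data.frame(Chr = c("2", "2", "4"),
                    Pos = c(25469505, 25469500, 106197000),
                    Ref = c("C", "G", "A"),
                    Alt = c("T", "A", "G"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: TRUE for every call that lies in at least one region
is_on_target <- function(calls, target) {
  vapply(seq_len(nrow(calls)), function(i) {
    any(target$chr == calls$Chr[i] &
          calls$Pos[i] > target$start &
          calls$Pos[i] <= target$end)
  }, logical(1))
}

on_target <- is_on_target(calls, target)
filtered <- calls[on_target, ]
print(filtered)
```

The second call sits at the 0-based start coordinate, which is one base before the first covered position, so it is treated as off-target in this sketch.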

Value

A list of data.frames is returned. Every list element contains the information on one sample. Every data.frame contains the columns SampleID (taken from the input file names), Chr, Pos, Ref and Alt.
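
A returned list of this shape can be inspected with base R alone. The sketch below builds a made-up stand-in for the return value (the variable name result and all calls are illustrative), counts the calls per sample, and stacks everything into one table.

```r
# Made-up stand-in for the return value: one data.frame per sample
result <- list(
  data.frame(SampleID = "Sample1", Chr = c("2", "17"),
             Pos = c(25469502, 7579472),
             Ref = c("C", "G"), Alt = c("T", "C"),
             stringsAsFactors = FALSE),
  data.frame(SampleID = "Sample2", Chr = "21",
             Pos = 36164405,
             Ref = "G", Alt = "A",
             stringsAsFactors = FALSE)
)

# Number of on-target calls per sample
calls_per_sample <- vapply(result, nrow, integer(1))
print(calls_per_sample)

# One long table with all samples, e.g. for export
all_calls <- do.call(rbind, result)
print(all_calls)
```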

Author(s)

Sarah Sandmann <sarah.sandmann@uni-muenster.de>

See Also

appreci8R, appreci8Rshiny, normalize, annotate, combineOutput, evaluateCovAndBQ, determineCharacteristics, finalFiltration

Examples

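library("appreci8R")

# An empty string means that no output files are written.
output_folder <- ""

# Target regions in bed-format: chr, 0-based start, 1-based end.
target <- data.frame(chr = c("2","4","12","17","21","X"),
                     start = c(25469500,106196950,12046280,7579470,36164400,15838363),
                     end = c(25469510,106196960,12046350,7579475,36164410,15838366))

# Folder with the example variant calling results shipped with the package.
caller_folder <- system.file("extdata", package = "appreci8R")

# Read the GATK calls (suffix ".rawMutations", file type ".vcf", SNVs and
# indels in the same file) and keep only the on-target calls.
targetFiltered <- filterTarget(output_folder, "GATK", caller_folder,
                               ".rawMutations", ".vcf", TRUE, "", "",
                               targetRegions = target)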

sandmanns/appreci8R documentation built on Dec. 7, 2020, 12:32 a.m.