View source: R/appreci8R_classical.R
filterTarget | R Documentation |
appreci8R combines and filters the output of different variant calling tools according to the 'appreci8' algorithm. In the first analysis step, all off-target calls are excluded from further analysis. A list of data.frames (one data.frame per sample) containing all on-target calls is returned.
filterTarget (output_folder, caller_name, caller_folder, caller_file_names_add,
caller_file_type, caller_snv_indel, caller_snv_names_add,
caller_indel_names_add, caller_chr = 1, caller_pos = 2,
caller_ref = 4, caller_alt = 5, targetRegions)
output_folder |
The folder to write the output files into. If an empty string is provided, no files are written out. |
caller_name |
Name of the variant calling tool (only necessary if an output folder is provided). |
caller_folder |
Folder containing the variant calling results. |
caller_file_names_add |
Suffix for naming the variant calling files. If an empty string is provided, it is assumed that the files only contain the sample name, e.g. "Sample1.vcf". |
caller_file_type |
File type of the variant calling results (".vcf" or ".txt"). |
caller_snv_indel |
SNVs and indels are reported in the same file ( |
caller_snv_names_add |
Suffix for naming the variant calling files containing SNVs (only evaluated if |
caller_indel_names_add |
Suffix for naming the variant calling files containing indels (only evaluated if |
caller_chr |
Column of the variant calling input containing information on chr (default: 1). |
caller_pos |
Column of the variant calling input containing information on pos (default: 2). |
caller_ref |
Column of the variant calling input containing information on ref (default: 4). |
caller_alt |
Column of the variant calling input containing information on alt (default: 5). |
targetRegions |
Either a data.frame object containing the target regions to be analyzed (bed-format: 1st column chr, 2nd column 0-based start pos, 3rd column 1-based end pos) or a GRanges object containing the target regions to be analyzed. |
The function filterTarget covers two steps: reading input and target filtration.
First, all files in caller_folder of the file type caller_file_type with the suffix caller_file_names_add are read. Sample names are automatically derived from the file names (e.g. a sample name would be called "Sample1" if a file was called "Sample1.txt" and no suffix was defined; a sample would be called "Sample1.mutations" if a file was called "Sample1.mutations.vcf" and no suffix was defined, but "Sample1" if the suffix ".mutations" was defined).
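The naming rule described above can be sketched in base R. This is only an illustration under stated assumptions: derive_sample_names is a hypothetical helper, not a function of appreci8R, and the package may derive the names differently internally.

```r
# Hypothetical helper (not part of appreci8R): derive sample names from
# file names by removing the file type and an optional suffix.
derive_sample_names <- function(file_names, suffix = "", file_type = ".vcf") {
  # The suffix and the file type together form the expected end of the name
  ending <- paste0(suffix, file_type)
  # Only consider files that actually end with suffix + file type
  matching <- file_names[endsWith(file_names, ending)]
  # Cut the ending off; what remains is used as the sample name
  substr(matching, 1, nchar(matching) - nchar(ending))
}

name_txt <- derive_sample_names("Sample1.txt", file_type = ".txt")
name_no_suffix <- derive_sample_names("Sample1.mutations.vcf")
name_suffix <- derive_sample_names("Sample1.mutations.vcf", suffix = ".mutations")
```

With these inputs, name_txt and name_suffix are "Sample1", while name_no_suffix is "Sample1.mutations", matching the three cases named in the text.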
If SNVs and indels are reported in separate files (in the same folder), caller_snv_indel==TRUE, and caller_snv_names_add and caller_indel_names_add are defined, input from two files per sample is read and automatically combined (e.g. a sample would be called "Sample1" if the files "Sample1.SNV.vcf" and "Sample1.indel.vcf" are read, caller_snv_indel==TRUE, caller_snv_names_add was defined as ".SNV" and caller_indel_names_add was defined as ".indel").
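Pairing and combining two files per sample could look roughly like the sketch below. The file names follow the example in the text. The call tables and their values are hypothetical placeholders for the parsed file contents, and the code does not claim to reproduce the package's internal implementation.

```r
# File names as in the example above
files <- c("Sample1.SNV.vcf", "Sample1.indel.vcf")

# Derive the sample name from each kind of file and keep samples that have both
snv_samples <- sub(".SNV.vcf", "", files[endsWith(files, ".SNV.vcf")], fixed = TRUE)
indel_samples <- sub(".indel.vcf", "", files[endsWith(files, ".indel.vcf")], fixed = TRUE)
shared_samples <- intersect(snv_samples, indel_samples)

# Hypothetical call tables standing in for the parsed contents of both files
snv_calls <- data.frame(Chr = "17", Pos = 7579472, Ref = "G", Alt = "C",
                        stringsAsFactors = FALSE)
indel_calls <- data.frame(Chr = "2", Pos = 25469502, Ref = "CAG", Alt = "C",
                          stringsAsFactors = FALSE)

# Stack SNVs and indels and label every row with the shared sample name
combined <- data.frame(SampleID = shared_samples[1],
                       rbind(snv_calls, indel_calls),
                       stringsAsFactors = FALSE)
```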
Subsequently, the read variant calling results are filtered according to the defined target region. All off-target calls are excluded from further analysis.
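As a rough illustration of target filtration, the following base R sketch keeps only calls that fall into a target region. keep_on_target is a hypothetical helper, not part of appreci8R. It assumes bed-style regions (0-based start, 1-based end) and 1-based call positions, so a call counts as on-target if start < Pos <= end. Whether the package treats region boundaries in exactly this way is an assumption. The call values are made up for illustration.

```r
# Hypothetical helper (not part of appreci8R): keep only on-target calls.
# Assumes regions with columns chr, start (0-based) and end (1-based),
# and calls with columns Chr and Pos (1-based).
keep_on_target <- function(calls, regions) {
  on_target <- vapply(seq_len(nrow(calls)), function(i) {
    # A call is on-target if at least one region on the same chromosome covers it
    any(regions$chr == calls$Chr[i] &
          calls$Pos[i] > regions$start &
          calls$Pos[i] <= regions$end)
  }, logical(1))
  calls[on_target, , drop = FALSE]
}

# Two of the target regions used in the example of this help page
regions <- data.frame(chr = c("2", "17"),
                      start = c(25469500, 7579470),
                      end = c(25469510, 7579475),
                      stringsAsFactors = FALSE)

# Hypothetical calls: one inside a region, two outside
calls <- data.frame(Chr = c("2", "17", "17"),
                    Pos = c(25469505, 7579470, 7579480),
                    Ref = c("C", "G", "A"),
                    Alt = c("T", "A", "G"),
                    stringsAsFactors = FALSE)

on_target_calls <- keep_on_target(calls, regions)
```

Here only the call on chromosome 2 remains. Position 7579470 on chromosome 17 equals the 0-based start and therefore lies just before the region under the stated assumption.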
A list of data.frames is returned. Every list element contains the information on one sample. Every data.frame contains the columns SampleID (taken from the input file names), Chr, Pos, Ref and Alt.
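A returned list of this shape can be handled with standard base R tools. The sketch below uses a hand-built stand-in for the return value. The sample names and variant values are hypothetical, and the exact column names are assumed from the description above.

```r
# Hypothetical stand-in for the value returned by filterTarget():
# a list with one data.frame per sample
targetFiltered <- list(
  data.frame(SampleID = "Sample1", Chr = "2", Pos = 25469505,
             Ref = "C", Alt = "T", stringsAsFactors = FALSE),
  data.frame(SampleID = "Sample2", Chr = "17", Pos = 7579472,
             Ref = "G", Alt = "C", stringsAsFactors = FALSE)
)

# Number of on-target calls per sample
calls_per_sample <- vapply(targetFiltered, nrow, integer(1))

# Stack all samples into a single table, e.g. for a quick overview
all_calls <- do.call(rbind, targetFiltered)
```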
Sarah Sandmann <sarah.sandmann@uni-muenster.de>
appreci8R, appreci8Rshiny, normalize, annotate, combineOutput, evaluateCovAndBQ, determineCharacteristics, finalFiltration
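library("appreci8R")
# An empty string means that no output files are written
output_folder <- ""
# Target regions in bed-format (chr, 0-based start, 1-based end)
target <- data.frame(chr = c("2", "4", "12", "17", "21", "X"),
                     start = c(25469500, 106196950, 12046280, 7579470, 36164400, 15838363),
                     end = c(25469510, 106196960, 12046350, 7579475, 36164410, 15838366))
# Folder with the example variant calling results shipped with the package
caller_folder <- system.file("extdata", package = "appreci8R")
# Read the GATK calls and keep only the on-target ones
targetFiltered <- filterTarget(output_folder, "GATK", caller_folder,
                               ".rawMutations", ".vcf", TRUE, "", "",
                               targetRegions = target)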