pipe.VariantSummary: Summarize all Chromosomes of SNP Calls into one File

View source: R/pipe.VariantCalls.R

pipe.VariantSummary    R Documentation

Summarize all Chromosomes of SNP Calls into one File

Description

Combines all the separate chromosome SNP call results for a sample into a single file, with optional filtering by exon, score, etc.

Usage

pipe.VariantSummary(sampleID, speciesID = getCurrentSpecies(), annotationFile = "Annotation.txt", 
		optionsFile = "Options.txt", results.path = NULL, seqIDset = NULL, min.depth = 1, 
		min.score = 5, exonOnly = FALSE, snpOnly = FALSE)
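
A call with stricter filtering than the defaults might look like the sketch below. The wrapper name summarizeStrictSNPs and the threshold values are hypothetical, chosen only for illustration. The sketch assumes the DuffyNGS package is installed, that it exports pipe.VariantSummary, and that pipe.VariantCalls has already been run for the sample. The wrapper is only defined here, so the block runs without DuffyNGS being present.

```r
# Hypothetical convenience wrapper (not part of DuffyNGS).
# Defining it does not require the package; calling it does.
summarizeStrictSNPs <- function(sampleID, seqIDset = NULL) {
  # Assumption: DuffyNGS is installed and exports pipe.VariantSummary.
  if (!requireNamespace("DuffyNGS", quietly = TRUE)) {
    stop("the DuffyNGS package is required for this sketch")
  }
  # Argument names are taken from the Usage section above;
  # the threshold values are illustrative only.
  DuffyNGS::pipe.VariantSummary(sampleID,
    annotationFile = "Annotation.txt", optionsFile = "Options.txt",
    seqIDset = seqIDset, min.depth = 10, min.score = 20,
    exonOnly = TRUE, snpOnly = TRUE)
}
```

With the package loaded, summarizeStrictSNPs("Sample1") would then summarize the hypothetical sample "Sample1" across all chromosomes.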

Arguments

sampleID

Character string of one SampleID that already has SNP calls done.

annotationFile

File of sample annotation details, which specifies all needed sample-specific information about the samples under study. See DuffyNGS_Annotation.

optionsFile

File of processing options, which specifies all processing parameters that are not sample specific. See DuffyNGS_Options.

speciesID

The SpeciesID of the target species to call SNPs for. By default, use the current species.

results.path

The top level folder path for writing result files to. By default, read from the Options file entry 'results.path'.

seqIDset

Optional character vector of SeqIDs. By default, SNPs from all chromosomes are combined and summarized.

min.depth, min.score

Filtering arguments, to only keep SNP calls that meet minimum criteria about depth of read coverage and the BCFTOOLS CALL score metric.
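
The effect of these two thresholds can be sketched on a toy table. This is not the package's own code: the column names DEPTH and SCORE, the helper filterByDepthAndScore, and the use of inclusive (>=) comparisons are all assumptions made for illustration.

```r
# Toy table of SNP calls. The column names are assumptions for this sketch,
# not the documented layout of the summary file.
calls <- data.frame(
  SEQ_ID   = c("chr1", "chr1", "chr2", "chr2"),
  POSITION = c(1200, 5630, 88, 4410),
  DEPTH    = c(25, 3, 40, 12),
  SCORE    = c(60, 45, 4, 18),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only calls that meet both minimum criteria.
# Inclusive thresholds are assumed here.
filterByDepthAndScore <- function(tbl, min.depth = 1, min.score = 5) {
  keep <- tbl$DEPTH >= min.depth & tbl$SCORE >= min.score
  tbl[keep, , drop = FALSE]
}

kept <- filterByDepthAndScore(calls, min.depth = 10, min.score = 5)
print(kept$POSITION)
```

In this toy data, the call at position 5630 fails the depth threshold and the call at position 88 fails the score threshold, so only positions 1200 and 4410 remain.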

exonOnly

Optional filtering to remove all SNP calls that lie in intergenic or intron regions. Intergenic SNPs are traditionally of less interest, and SNPs in introns are very likely false SNPs caused by alignment methods and/or poor genome annotation of true exon boundaries.

snpOnly

Optional filtering to remove all SNP calls where the alternate allele is not the most frequently observed base. There may be cases, such as mixed infections or the limits of alignment precision, where a minor allele gets flagged as a SNP but accounts for less than 50% of the observed base calls. The site is then called a SNP and yet still matches the reference base call. This option removes these 'paradoxical' SNP calls.
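
The idea of a 'paradoxical' call can be illustrated with a small sketch. It is a simplification, not the package's implementation: it assumes only two alleles per site, and the column names REF_COUNT and ALT_COUNT are invented for the example.

```r
# Hypothetical per-site base counts for the reference and alternate allele.
# Column names are invented for this sketch.
sites <- data.frame(
  POSITION  = c(100, 200, 300),
  REF_COUNT = c(2, 30, 18),
  ALT_COUNT = c(38, 10, 22)
)

# Fraction of observed base calls that support the alternate allele.
altFraction <- sites$ALT_COUNT / (sites$REF_COUNT + sites$ALT_COUNT)

# With two alleles, the alternate is the most frequent base
# exactly when it accounts for more than half of the calls.
isMajorAlt <- altFraction > 0.5

snpOnlySites <- sites[isMajorAlt, , drop = FALSE]
print(snpOnlySites$POSITION)
```

The site at position 200 has an alternate allele in only 25% of its base calls, so it is the kind of call this option would remove, while positions 100 and 300 are kept.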

Value

One file is written under the VariantCalls subfolder:

Summary.VCF.txt

One final file of SNP sites, after merging all chromosomes and cleaning up much of the BCFTOOLS details. Includes a column "ALT_AA" that tries to suggest if the SNP changes the amino acid sequence of the protein.

Author(s)

Bob Morrison

See Also

pipe.VariantCalls for doing the initial SNP calls on each chromosome.

pipe.VariantComparison for finding SNPs that are differentially detected between groups.


robertdouglasmorrison/DuffyNGS documentation built on March 24, 2024, 4:16 p.m.