pipe.AlignStats: Generate Alignment Success Stats Images.

pipe.AlignStats R Documentation

Generate Alignment Success Stats Images.


Auxiliary pipeline step that creates a family of alignment statistic images to summarize all aspects of the alignment pipeline and its metrics.


pipe.AlignStats( sampleID, annotationFile = "Annotation.txt", optionsFile = "Options.txt", 
	results.path = NULL, banner = "", chunkSize = 500000, maxReads = NULL,
	mode = c( "normal", "QuickQC"), what = NULL, plot = TRUE, fastqFile = NULL)

pipe.AlignmentPie( sampleID, annotationFile = "Annotation.txt", optionsFile = "Options.txt", 
	results.path = NULL, banner = "", mode = c( "normal", "QuickQC"), 
	fastqFile = NULL, useUSR = TRUE)



sampleID: The SampleID for this sample.


annotationFile: File of sample annotation details, which specifies all needed sample-specific information about the samples under study. See DuffyNGS_Annotation.


optionsFile: File of processing options, which specifies all processing parameters that are not sample specific. See DuffyNGS_Options.


results.path: The top-level folder path to which result files are written. By default, it is read from the Options file entry 'results.path'.


banner: Optional character string to add to each plot's main heading.


chunkSize: Integer. The buffer size to use for reading in and evaluating alignments. Most statistics are tallied and images printed after each buffer, to show incremental progress.


maxReads: Optional integer to limit the number of alignments evaluated.
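
The interplay of chunkSize and maxReads can be pictured with a small base R sketch. This is only an illustration under assumed semantics, not DuffyNGS code: the helper name chunkRanges and the argument nAlignments are hypothetical, and the real function may buffer alignments differently.

```r
# Hypothetical helper (not part of DuffyNGS): compute the record ranges
# that would be visited when reading alignments in fixed-size buffers.
chunkRanges <- function(nAlignments, chunkSize = 500000, maxReads = NULL) {
  # Assumption: maxReads, when given, simply caps the total evaluated
  nTotal <- if (is.null(maxReads)) nAlignments else min(nAlignments, maxReads)
  if (nTotal < 1) return(data.frame(from = integer(0), to = integer(0)))
  starts <- seq(1, nTotal, by = chunkSize)
  ends <- pmin(starts + chunkSize - 1, nTotal)
  data.frame(from = starts, to = ends)
}

# Three buffers for 1.2 million alignments at the default buffer size
chunkRanges(1200000, chunkSize = 500000)

# With a cap, the final buffer is cut short at the limit
chunkRanges(1200000, chunkSize = 500000, maxReads = 600000)
```

Under these assumptions, a smaller chunkSize gives more frequent progress updates at the cost of more tallying passes.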


mode: Controls how alignments are interpreted. Mode "QuickQC" invokes the behavior used for preliminary QC analysis. See pipe.QuickQC.
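
A default written as a character vector, as in mode = c( "normal", "QuickQC"), is a common R idiom in which the first element acts as the default when resolved with match.arg. The sketch below shows that idiom in isolation; resolveMode is a hypothetical wrapper, and whether pipe.AlignStats resolves its mode argument this way internally is an assumption.

```r
# Hypothetical wrapper (not part of DuffyNGS) showing the match.arg idiom.
resolveMode <- function(mode = c("normal", "QuickQC")) {
  # match.arg() picks the first choice when 'mode' is not supplied,
  # and raises an error for a value that matches none of the choices
  match.arg(mode)
}

resolveMode()            # mode omitted: first choice is used
resolveMode("QuickQC")   # explicit choice
```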


what: An optional character string that specifies which types of statistics to monitor. The default is to monitor every type of feature, equivalent to "SGBIDMA", where:

S: Sequences: features about chromosomes, like read counts and percentages.

G: Genes: features about genes, like read counts and percentages for highly detected genes.

B: Bases: features about base calls, locations of mismatches, and nucleotide usage.

I,D: Insertions & Deletions: features about indel locations in the aligned reads.

M: MARs (Multiply Aligned Reads): features about reads hitting 2+ locations.

A: Align scores: features about the distribution of Bowtie alignment scores.
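
The one-letter codes above can be combined into a single string, for example "SG" to monitor only sequence and gene statistics. A minimal sketch of how such a string might be decoded into per-category flags follows; decodeWhat is a hypothetical helper written for illustration, not a DuffyNGS function, and the package may validate or interpret the string differently.

```r
# Hypothetical helper (not part of DuffyNGS): turn a 'what' string into flags.
decodeWhat <- function(what = NULL) {
  allCodes <- c("S", "G", "B", "I", "D", "M", "A")
  # Assumption: NULL means "monitor everything", i.e. "SGBIDMA"
  if (is.null(what)) what <- paste(allCodes, collapse = "")
  requested <- strsplit(what, split = "")[[1]]
  unknown <- setdiff(requested, allCodes)
  if (length(unknown) > 0) {
    stop("Unrecognized 'what' code(s): ", paste(unknown, collapse = ", "))
  }
  flags <- allCodes %in% requested
  names(flags) <- allCodes
  flags
}

decodeWhat("SG")   # only S and G are TRUE
decodeWhat()       # every category is TRUE
```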


fastqFile: Optional character string for the original FASTQ file that was input to the alignment pipeline. The default is to look it up from the annotation file.


useUSR: Logical. Include a survey of USRs (Unique Short Reads) in the pie, to assess presence of empty adapters, Poly-N, etc.


This pipeline step tries to evaluate every aspect of how well the raw reads aligned to the target organism(s). It generates a large family of plot images, each of which shows some measure of alignment success or failure.


A family of files and plot images is created on disk under the subfolder AlignStats.

Also returns a list of read counts and percentages, as produced by the alignment pie function, that summarizes the alignment status of the entire sample.
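
A call might look like the sketch below. The sample ID "Sample1" is a made-up placeholder, and the example assumes that the DuffyNGS package is installed and that Annotation.txt and Options.txt exist in the working directory; the guard skips the call otherwise. The structure of the returned list is not documented here, so the example only inspects it with str().

```r
# Placeholder sample ID (hypothetical); use an ID from your own annotation file
exampleSample <- "Sample1"

# Only attempt the call when the package and both input files are available
if (requireNamespace("DuffyNGS", quietly = TRUE) &&
    file.exists("Annotation.txt") && file.exists("Options.txt")) {
  library(DuffyNGS)
  # Restrict to sequence and gene statistics and cap the alignments evaluated
  alignSummary <- pipe.AlignStats(exampleSample,
                                  annotationFile = "Annotation.txt",
                                  optionsFile = "Options.txt",
                                  what = "SG", maxReads = 1000000)
  str(alignSummary)
}
```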


Bob Morrison

robertdouglasmorrison/DuffyNGS documentation built on Sept. 1, 2024, 9:25 p.m.