pipe.Transcriptome: Turn Alignments into Wiggle Tracks and Transcriptomes.

Description


Pipeline step that generates transcriptomes of gene expression levels for one sample. The alignment BAM files are converted to (optionally) strand specific and unique/multi-hit wiggle tracks, and then read pileup depth is measured for all genes in all target species.


Usage

pipe.Transcriptome(sampleID, annotationFile = "Annotation.txt", optionsFile = "Options.txt",
	speciesID = NULL, results.path = NULL, dataType = NULL, altGeneMap = NULL,
	altGeneMapLabel = NULL, loadWIG = FALSE, verbose = TRUE, mode = "normal", exonsOnly = NULL)



Arguments

sampleID: The SampleID for this sample. This SampleID keys one row of the annotation file, from which sample-specific details are read. The SampleID is also used as a sample-specific prefix for all files created during the processing of this sample.


annotationFile: File of sample annotation details, which specifies all needed sample-specific information about the samples under study. See DuffyNGS_Annotation.


optionsFile: File of processing options, which specifies all processing parameters that are not sample specific. See DuffyNGS_Options.


speciesID: The SpeciesID of the target species to calculate a transcriptome for. By default, transcriptomes are generated for all species in the current target.


results.path: The top-level folder path for writing result files. By default, read from the Options file entry 'results.path'.


dataType: The type of raw data contained in the FASTQ files. By default, read from the 'DataType' field of the Annotation file.


altGeneMap: An alternate data frame of gene annotations, in a format identical to that returned by getCurrentGeneMap, giving the gene names and locations to be measured for read pileup depth. By default, the standard built-in gene map for each species is used.


altGeneMapLabel: A character string identifier for the alternate gene map. It becomes part of all created path and file names, to indicate the gene map that produced those transcriptomes.


loadWIG: Logical; should all wiggle track data structures be rebuilt from the alignment BAM files? By default, the wiggle files are only created if they do not yet exist for this sample. An incorrect read sense for strand-specific reads can be fixed by updating the annotation field and then rerunning just this pipeline step with loadWIG=TRUE.


verbose: Logical; send progress information to standard out.


mode: Controls how alignments are loaded into the wiggle track data objects. The optional mode "QuickQC" invokes the behavior for preliminary QC analysis. See pipe.QuickQC.


exonsOnly: Logical; controls which regions of the gene footprint get integrated into the gene's expression calculation. When FALSE, the gene's entire extent is used, without regard to exons, introns, or UTRs. This is better for poorly annotated genomes such as Plasmodium. When TRUE, only the regions inside exons are used for measuring the gene's abundance. This mode is better for carefully annotated genomes with overlapping genes, but it runs slower. When NULL, the TRUE/FALSE value for this sample is looked up in the 'ExonsOnly' field of the Annotation Table.
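To make the exonsOnly distinction concrete, here is a toy sketch in base R. It is not the DuffyNGS implementation; the depth vector, the gene coordinates, and the helper names meanDepthExtent and meanDepthExons are all invented for illustration. It only shows how averaging pileup depth over the whole gene extent differs from averaging over exons alone when an intron has low coverage.

```r
# Toy per-base pileup depth along a 110 bp stretch (invented numbers):
# flank (0), exon 1 (20), intron (2), exon 2 (20), flank (0).
depth <- c(rep(0, 10), rep(20, 30), rep(2, 40), rep(20, 20), rep(0, 10))

# Hypothetical gene extent and exon table for this illustration only.
gene  <- c(start = 11, end = 100)
exons <- data.frame(start = c(11, 81), end = c(40, 100))

# Mean depth over the full extent, ignoring exon/intron structure.
meanDepthExtent <- function(depth, start, end) {
  mean(depth[start:end])
}

# Mean depth over exon positions only.
meanDepthExons <- function(depth, exons) {
  idx <- unlist(Map(seq, exons$start, exons$end))
  mean(depth[idx])
}

wholeExtent <- meanDepthExtent(depth, gene[["start"]], gene[["end"]])
exonsOnly   <- meanDepthExons(depth, exons)

cat("whole extent:", wholeExtent, " exons only:", exonsOnly, "\n")
```

In this toy case the low-coverage intron pulls the whole-extent average down to 12, while the exon-only average is 20. The exon-only variant also has to build an index per exon, which hints at why that mode is described as slower.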


Details

This pipeline step turns the BAM files of alignment results into tabular files of gene expression data.
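A minimal sketch of how this step might be invoked, using only the arguments shown in the usage above. The sample ID "Sample1" is hypothetical and must match a row in your own annotation file. The calls are guarded so that nothing runs unless the DuffyNGS package and both input files are actually present.

```r
# Hypothetical sample ID; it must key one row of the annotation file.
sampleID <- "Sample1"

# Only attempt the calls when the package and both input files exist.
canRun <- requireNamespace("DuffyNGS", quietly = TRUE) &&
  file.exists("Annotation.txt") && file.exists("Options.txt")

if (canRun) {
  library(DuffyNGS)

  # Default run: all target species; existing wiggle tracks are reused.
  pipe.Transcriptome(sampleID,
                     annotationFile = "Annotation.txt",
                     optionsFile = "Options.txt")

  # Rerun just this step, rebuilding the wiggle tracks from the BAM files,
  # for example after correcting the read sense in the annotation file.
  pipe.Transcriptome(sampleID, loadWIG = TRUE)
} else {
  message("DuffyNGS or its input files were not found; skipping the example.")
}
```

Both calls assume that pipe.Alignment has already produced BAM files for the sample.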


Value

A family of files is created:


A file and subfolder of wiggle track data objects for this sample may be written to the 'wig' subfolder of results. These contain the strand and unique/multihit details of all chromosomes for all species in the sample.


A file of gene expression for all genes in the gene annotation, for each species. These are written to the 'transcript' subfolder of results.
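One way to check what a run produced is to list the files carrying the sample prefix in each subfolder. This is a base R sketch; the helper name listSampleFiles and the "results" path are assumptions for illustration, and the exact file names written by the pipeline are not specified here.

```r
# Hypothetical helper: list files under <resultsPath>/<subfolder> whose
# names start with the given SampleID. Returns character(0) if the folder
# does not exist.
listSampleFiles <- function(resultsPath, subfolder, sampleID) {
  folder <- file.path(resultsPath, subfolder)
  if (!dir.exists(folder)) return(character(0))
  files <- list.files(folder, recursive = TRUE, full.names = TRUE)
  files[startsWith(basename(files), sampleID)]
}

# Assumed values; the real results path normally comes from the
# 'results.path' entry of the Options file.
resultsPath <- "results"
sampleID <- "Sample1"

wigFiles        <- listSampleFiles(resultsPath, "wig", sampleID)
transcriptFiles <- listSampleFiles(resultsPath, "transcript", sampleID)

cat(length(wigFiles), "wiggle file(s),",
    length(transcriptFiles), "transcript file(s) found\n")
```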


Author(s)

Bob Morrison

See Also

pipe.Alignment for the previous pipeline step that turns raw FASTQ data to BAM alignments. pipe.TranscriptToHTML for turning the transcriptome into a web page with gene pileup images. pipe.DiffExpression for turning a set of transcriptomes into files of differential gene expression.

robertdouglasmorrison/DuffyNGS documentation built on Dec. 7, 2018, 8:01 a.m.