pipe.RNAseq: DuffyNGS Pipelines

pipe.RNAseqR Documentation

DuffyNGS Pipelines

Description

The top level all-in-one function for turning raw FASTQ files into wiggle tracks, transcriptomes, etc.

Usage

pipe.RNAseq(  sampleID = NULL, annotationFile = "Annotation.txt", optionsFile = "Options.txt")
pipe.DNAseq(  sampleID = NULL, annotationFile = "Annotation.txt", optionsFile = "Options.txt")
pipe.ChIPseq( sampleID = NULL, annotationFile = "Annotation.txt", optionsFile = "Options.txt")
pipe.RIPseq( sampleID = NULL, annotationFile = "Annotation.txt", optionsFile = "Options.txt")

pipeline( sampleID = NULL, annotationFile = "Annotation.txt", optionsFile = "Options.txt")

Arguments

sampleID

The SampleID for this sample. This SampleID keys for one row of annotation details in the annotation file, for getting sample-specific details. The SampleID is also used as a sample-specific prefix for all files created during the processing of this sample.

annotationFile

File of sample annotation details, which specifies all needed sample-specific information about the samples under study. See DuffyNGS_Annotation.

optionsFile

File of processing options, which specifies all processing parameters that are not sample specific. See DuffyNGS_Options.

Details

Turn raw sequencing reads into final results for that type of data, building various files of results along the way. pipeline is a wrapper function that looks at the annotation field DataType to dispatch the correct pipeline tool. Main components common to all data types include:

pipe.PreAlignTasks Setup. Create the results directories (and cleanup of any previous files) prior to running the pipeline on this sample.

pipe.Alignment Alignment. Turn FASTQ data into the various types of alignments via the Bowtie2 alignment program.

pipe.Transcriptome Transcription (if RNA-seq data) or ChIP Peaks (if ChIP-seq data). Turn alignments into wiggle tracks, and then generate transcriptome(s) or ChIP peak calls for all target species.

pipe.PostAlignTasks Optional post-alignment tasks, such as de novo assembly of No-Hit reads.

For a typical dataset, this top levelpipeline can take many hours to complete, and is usually configured to be run as a batch job submitted to a compute cluster.

Value

a family of data files written to disk, and a text log of all details sent to standard out.

Author(s)

Bob Morrison


robertdouglasmorrison/DuffyNGS documentation built on March 24, 2024, 4:16 p.m.