pipe.DEtools: Pipes for Group-wise Differential Expression Tools like...
In robertdouglasmorrison/DuffyNGS: Duffy Lab NGS Analysis Pipeline for RNA-seq, DNA-seq, ChIP-seq, and RIP-seq

pipe.DEtools

R Documentation

Pipes for Group-wise Differential Expression Tools like DESeq, EdgeR, SAM, etc.

Description

Wrapper functions to a family of published DE tools, to find significant differentially expressed genes between groups of samples.

Usage

pipe.DESeq(sampleIDset, speciesID = getCurrentSpecies(), annotationFile = "Annotation.txt", 
	optionsFile = "Options.txt", useMultiHits = TRUE, results.path = NULL, 
	groupColumn = "Group", colorColumn = "Color", folderName = "", 
	altGeneMap = NULL, altGeneMapLabel = NULL, targetID = NULL, Ngenes = 100, 
	geneColumnHTML = if (speciesID %in% MAMMAL_SPECIES) "NAME" else "GENE_ID", 
	keepIntergenics = FALSE, verbose = !interactive(), label = "", 
	doDE = TRUE, PLOT.FUN = NULL, ...)

pipe.EdgeR(sampleIDset, speciesID = getCurrentSpecies(), annotationFile = "Annotation.txt", 
	optionsFile = "Options.txt", useMultiHits = TRUE, results.path = NULL, 
	groupColumn = "Group", colorColumn = "Color", folderName = "", 
	altGeneMap = NULL, altGeneMapLabel = NULL, targetID = NULL, Ngenes = 100, 
	geneColumnHTML = if (speciesID %in% MAMMAL_SPECIES) "NAME" else "GENE_ID", 
	keepIntergenics = FALSE, verbose = !interactive(), label = "", 
	doDE = TRUE, PLOT.FUN = NULL, ...)

pipe.RankProduct(sampleIDset, speciesID = getCurrentSpecies(), annotationFile = "Annotation.txt", 
	optionsFile = "Options.txt", useMultiHits = TRUE, results.path = NULL, 
	groupColumn = "Group", colorColumn = "Color", folderName = "", 
	altGeneMap = NULL, altGeneMapLabel = NULL, targetID = NULL, Ngenes = 100, 
	geneColumnHTML = if (speciesID %in% MAMMAL_SPECIES) "NAME" else "GENE_ID", 
	keepIntergenics = FALSE, verbose = !interactive(), label = "", 
	doDE = TRUE, PLOT.FUN = NULL, ...)

pipe.RoundRobin(sampleIDset, speciesID = getCurrentSpecies(), annotationFile = "Annotation.txt", 
	optionsFile = "Options.txt", useMultiHits = TRUE, results.path = NULL, 
	groupColumn = "Group", colorColumn = "Color", folderName = "", 
	altGeneMap = NULL, altGeneMapLabel = NULL, targetID = NULL, Ngenes = 100, 
	geneColumnHTML = if (speciesID %in% MAMMAL_SPECIES) "NAME" else "GENE_ID", 
	keepIntergenics = FALSE, verbose = !interactive(), label = "", 
	doDE = TRUE, PLOT.FUN = NULL, ...)

pipe.SAM(sampleIDset, speciesID = getCurrentSpecies(), annotationFile = "Annotation.txt", 
	optionsFile = "Options.txt", useMultiHits = TRUE, results.path = NULL, 
	groupColumn = "Group", colorColumn = "Color", folderName = "", 
	altGeneMap = NULL, altGeneMapLabel = NULL, targetID = NULL, Ngenes = 100, 
	geneColumnHTML = if (speciesID %in% MAMMAL_SPECIES) "NAME" else "GENE_ID", 
	keepIntergenics = FALSE, verbose = !interactive(), label = "", 
	doDE = TRUE, PLOT.FUN = NULL, ...)

Arguments

`sampleIDset`	Character vector of SampleIDs, giving the full set of samples that will take part in the DE calculations.
`speciesID`	The SpeciesID for one single species. The DE tools do not operate on multiple species at one time.
`annotationFile`	File of sample annotation details, which specifies all needed sample-specific information about the samples under study. See `DuffyNGS_Annotation`.
`optionsFile`	File of processing options, which specifies all processing parameters that are not sample specific. See `DuffyNGS_Options`.
`useMultiHits`	Logical. By default, all DE tools use the RPKM or READ values from the transcriptomes that correspond to keeping all aligned reads, including alignments called "MultiHit" reads. If `FALSE`, only uniquely mapped reads are used. Because the transcriptomes store both ways of counting gene abundance, checking how this choice affects the DE results is trivial.
`results.path`	The top level folder path for writing result files to. By default, read from the Options file entry 'results.path'.
`groupColumn`	Character string specifying one column of the annotation table, to give the group name for each sample.
`colorColumn`	Character string specifying one column of the annotation table, to give the group color for each sample.
`folderName`	Required character string, with no embedded blanks, used to name the folder of DE results that will be generated by the DE tool. Typically, use a short but informative name that describes the groups being compared.
`altGeneMap`	An alternate data frame of gene annotations, in a format identical to `getCurrentGeneMap`, that has the gene names and locations to be measured for differential expression. By default, use the standard built-in gene map for this species.
`altGeneMapLabel`	A character string identifier for the alternate gene map, that becomes part of all created path and file names to indicate the gene map that produced the transcriptomes used in this DE analysis.
`targetID`	Optional character string giving the target organism(s) being compared. Used by the gene plotting tools, defaults to the current target.
`Ngenes`	Number of genes to show in the HTML results and to create gene plot images for.
`geneColumnHTML`	The name of one column in the current gene map, that contains the identifier shown in the HTML results. Some genomes require complex compound GeneIDs to give genomic location specificity, but are unwieldy for routine use. This argument lets a second simpler identifier be used as a surrogate GeneID.
`keepIntergenics`	Logical. All transcriptomes keep gene expression values for the intergenic "non-gene" regions defined in the gene map. This argument controls whether those intergenic regions are included in (`TRUE`) or excluded from (`FALSE`, the default) the DE fold change comparisons and results.
`label`	A character string that is passed to the gene plot tool, for inclusion in the main plot header.
`doDE`	Logical, controls whether the complete DE analysis is performed, or whether to just use results already present in the DE subfolder. Typically used to just remake gene plot images.
`PLOT.FUN`	An alternative function to use for generating gene plot images, that accepts a single GeneID as its argument. Use `NA` to suppress all plotting.
`...`	Other arguments passed down to the gene plotting function.
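
As an illustration of the `PLOT.FUN` argument, the sketch below defines a plotting function that accepts a single GeneID, as the argument description requires. The expression matrix `expr`, the color vector `groupColors`, and the function name `myGenePlot` are all hypothetical and are not part of DuffyNGS; a real function would pull its values from wherever your expression data lives.

```r
# Hypothetical expression matrix: rows are genes, columns are samples.
expr <- matrix(c(10, 12, 40, 44,
                  5,  6,  5,  7),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("GeneA", "GeneB"),
                               c("S1", "S2", "S3", "S4")))

# Hypothetical per-sample colors, playing the role of the annotation color column.
groupColors <- c(S1 = "blue", S2 = "blue", S3 = "red", S4 = "red")

# A plotting function that takes one GeneID and draws on the current device.
myGenePlot <- function(geneID) {
  values <- expr[geneID, ]
  barplot(values, col = groupColors[names(values)],
          main = geneID, ylab = "Expression")
  invisible(values)
}
```

A function of this shape could then be supplied as `PLOT.FUN = myGenePlot` in any of the `pipe.*` calls shown under Usage.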

Details

Even though these 5 DE tools implement different methods of determining differential expression and take different input arguments, we use a common calling command line to simplify the use of all 5 tools and to standardize how they report their results.
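
Because all 5 tools share one argument list, a single set of arguments can be reused across them. The sketch below only builds the calls without evaluating them, so it runs without DuffyNGS installed. The SampleIDs and the folder name are hypothetical placeholders.

```r
# The five wrapper functions listed under Usage.
deTools <- c("pipe.DESeq", "pipe.EdgeR", "pipe.RankProduct",
             "pipe.RoundRobin", "pipe.SAM")

# One shared argument list (values are hypothetical placeholders).
commonArgs <- list(
  sampleIDset = c("S1", "S2", "S3", "S4"),
  groupColumn = "Group",
  folderName  = "DrugA.vs.Control",
  Ngenes      = 100
)

# Build one unevaluated call per tool from the same arguments.
deCalls <- lapply(deTools, function(f) as.call(c(as.name(f), commonArgs)))
names(deCalls) <- deTools

print(deCalls[["pipe.SAM"]])

# To actually run them, DuffyNGS must be loaded and the annotation and
# options files must exist in the working directory:
# for (cl in deCalls) eval(cl)
```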

The grouping column from the annotation file determines how the samples are combined into groups, the names of all result files, and the number of different groups being compared. When more than 2 groups are being compared, a K-way comparison is performed in which each group is compared against all other groups combined: an "us against everyone who is not us" strategy.
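
The one-group-against-all-others partitioning can be sketched in base R as follows. This is not the package's internal code; the annotation table, its SampleIDs, and its group names are hypothetical, and only the partitioning of samples is shown.

```r
# Hypothetical annotation table using the default "Group" column name.
annotation <- data.frame(
  SampleID = c("S1", "S2", "S3", "S4", "S5", "S6"),
  Group    = c("Control", "Control", "DrugA", "DrugA", "DrugB", "DrugB"),
  stringsAsFactors = FALSE
)

groups <- unique(annotation$Group)

# For each group, split the samples into "us" (this group) and
# "them" (every sample from all other groups combined).
comparisons <- lapply(groups, function(g) {
  list(group = g,
       us    = annotation$SampleID[annotation$Group == g],
       them  = annotation$SampleID[annotation$Group != g])
})
names(comparisons) <- groups

str(comparisons$DrugA)
```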

Each comparison creates a family of result files, with suffix names "UP" and "DOWN", to convey the direction of each comparison. Note that in the case of just 2 groups, the UP and DOWN results are virtually symmetric, but that is never true for 3+ group comparisons. Each comparison file uses a composite naming strategy combining <Group>.<Species>.<DEtool>.<DirectionSuffix>.
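
As a rough illustration of the composite naming pattern, the sketch below assembles name stems for three hypothetical groups. The species prefix `"Hs"` and the group names are assumptions made for the example, and the files actually written by the tools may carry further suffixes or extensions.

```r
groups    <- c("Control", "DrugA", "DrugB")  # hypothetical group names
species   <- "Hs"                            # hypothetical species prefix
deTool    <- "DESeq"
direction <- c("UP", "DOWN")

# Every group paired with every direction; the first column varies fastest.
combos <- expand.grid(direction = direction, group = groups,
                      stringsAsFactors = FALSE)

# <Group>.<Species>.<DEtool>.<DirectionSuffix>
fileStems <- paste(combos$group, species, deTool, combos$direction, sep = ".")
print(fileStems)
```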

Value

A subfolder of result files, with a name constructed from the current species prefix and folderName. For each group name, a set of DE result files in various formats:

`Ratio.txt`	A tab delimited file of all genes in the species, sorted by fold-change and P-value, that includes all DE metrics returned by that DE tool.
`UP.html`
`DOWN.html`	A pair of HTML files of gene expression showing just the top `Ngenes` genes that are most differentially expressed for that comparison group and direction.
`All.GeneData.txt`	A tab delimited matrix file of all genes in the species, giving the expression values used by that DE tool (RPKM for some, READ counts for DESeq & EdgeR).
`Cluster & PCA`	A set of .PNG plots that visually summarize the similarity of the transcriptomes. The Round Robin DE tool augments the clustering with "group average" transcriptomes as well.

Author(s)

Bob Morrison

References

  DESeq:       Anders,  Genome Biology (2010)
  EdgeR:       Robinson,  Biostatistics (2008)
  RankProduct: Breitling,  FEBS Letters (2004)
  RoundRobin:  Morrison (unpublished)
  SAM:         Tusher,  PNAS (2001)
