erssa_deseq2_parallel: Run DESeq2 for computed sample combinations with parallel...

View source: R/DE_DESeq2.R

erssa_deseq2_parallel    R Documentation

Run DESeq2 for computed sample combinations with parallel computing

Description

erssa_deseq2_parallel performs the same calculation as erssa_deseq2 but uses BiocParallel to run the DESeq2 calculations in parallel. The function runs the DESeq2 Wald test to identify differentially expressed (DE) genes for each sample combination computed by the comb_gen function. A gene is considered differentially expressed when its padj is below the cutoff (default = 0.05) and its abs(log2FoldChange) is above the cutoff (default = 1). Optionally, the function can also save the DESeq2 result tables to the drive as csv files.

Usage

erssa_deseq2_parallel(
  count_table.filtered = NULL,
  combinations = NULL,
  condition_table = NULL,
  control = NULL,
  cutoff_stat = 0.05,
  cutoff_Abs_logFC = 1,
  save_table = FALSE,
  path = ".",
  num_workers = 1
)

Arguments

count_table.filtered

Count table pre-filtered to remove non-expressing and low-expressing genes. Can be the output of the count_filter function.

combinations

List of combinations produced by the comb_gen function.

condition_table

A condition table with two columns and each sample as a row. Column 1 contains sample names and Column 2 contains sample condition (e.g. Control, Treatment).

control

One of the condition names that will serve as control.

cutoff_stat

The cutoff in padj for DE consideration. Genes with lower padj pass the cutoff. Default = 0.05.

cutoff_Abs_logFC

The cutoff in abs(log2FoldChange) for differential expression consideration. Genes with higher abs(log2FoldChange) pass the cutoff. Default = 1.

save_table

Boolean. When set to TRUE, the function additionally saves the generated DESeq2 result tables as csv files. The files are saved in a new folder named "ERSSA_DESeq2_table" under the location given by path (the working directory by default). Tables are saved separately by replicate level. Default = FALSE.

path

Path to which the files will be saved. Defaults to the current working directory.

num_workers

Number of workers for parallel computing. Default=1.
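
As a rough illustration of the arguments above, the sketch below builds a small condition table in the two-column layout described for condition_table and picks a value for num_workers. The sample names, the conditions, and the use of parallel::detectCores() to size the worker pool are assumptions made for this example; they are not taken from the package.

```r
# Hypothetical sample names and conditions, for illustration only.
condition_table <- data.frame(
  name = c("heart_1", "heart_2", "muscle_1", "muscle_2"),
  condition = c("heart", "heart", "muscle", "muscle"),
  stringsAsFactors = FALSE
)

# The control must be one of the condition names in column 2.
control <- "heart"
stopifnot(control %in% condition_table[[2]])

# One possible way to choose num_workers (an assumption, not package
# guidance): leave one core free. detectCores() can return NA, so fall
# back to a single worker in that case.
cores <- parallel::detectCores()
num_workers <- if (is.na(cores)) 1L else max(1L, cores - 1L)
```

These objects could then be passed as the condition_table, control, and num_workers arguments shown in Usage.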

Details

The main function calls DESeq2 functions to perform the Wald test for each combination generated by comb_gen. In all tests, the pair-wise comparison uses the condition given in the control argument as the control condition.
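
In DESeq2, the reference level of the condition factor determines which group the fold changes are reported against. The base R sketch below shows one common way to set that reference; whether erssa_deseq2_parallel does this with relevel internally is an assumption, and the condition labels are made up.

```r
# Hypothetical condition labels for six samples.
condition <- factor(c("brain", "heart", "brain", "heart", "heart", "brain"))
control <- "heart"

# By default factor levels are sorted alphabetically ("brain" first).
# relevel() moves the control condition to the first (reference) level,
# so fold changes would be reported relative to it.
condition <- relevel(condition, ref = control)
levels(condition)
```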

In typical usage, after each test, the list of differentially expressed genes is filtered by padj and log2FoldChange values, and only the names of the genes that pass are kept for further analysis. It is also possible to save all of the generated result tables to the drive for additional analysis outside the scope of this package.
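
The filtering step described above can be sketched in base R as follows. The mock table only mimics the two DESeq2 result columns involved (padj and log2FoldChange); the gene names and values are invented, and this is an illustration of the cutoff logic rather than the package's actual code.

```r
# Mock result table with the two columns used for filtering.
# Gene names and values are invented for illustration.
res <- data.frame(
  padj = c(0.001, 0.20, 0.03, NA, 0.04),
  log2FoldChange = c(2.5, 3.0, -1.8, 4.0, 0.4),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

cutoff_stat <- 0.05
cutoff_Abs_logFC <- 1

# Keep genes with padj below the cutoff and abs(log2FoldChange) above it.
# which() drops rows where padj is NA.
keep <- which(res$padj < cutoff_stat &
                abs(res$log2FoldChange) > cutoff_Abs_logFC)
de_genes <- rownames(res)[keep]
de_genes
```

Only geneA and geneC pass both cutoffs here: geneB fails on padj, geneD has no padj, and geneE has too small a fold change.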

Value

A list of lists of vectors. The top-level list contains one element per replicate level. Each child list contains one element per combination at the respective replicate level. Each child vector contains the names of the differentially expressed genes.
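
To make the nesting concrete, the sketch below walks a mock object with the same shape and counts DE genes per combination. The element names (rep_2, comb_1, and so on) and the gene IDs are hypothetical; the names used in the real return value may differ.

```r
# Mock return value: element names and gene IDs are hypothetical.
deg <- list(
  rep_2 = list(comb_1 = c("geneA", "geneB"), comb_2 = c("geneA")),
  rep_3 = list(comb_1 = c("geneA", "geneB", "geneC"),
               comb_2 = c("geneB", "geneC"))
)

# Number of DE genes for every combination, grouped by replicate level.
n_de <- lapply(deg, function(level) vapply(level, length, integer(1)))

# Mean number of DE genes at each replicate level.
mean_de <- vapply(n_de, mean, numeric(1))
mean_de
```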

Author(s)

Zixuan Shao, Zixuanshao.zach@gmail.com

References

Morgan M, Obenchain V, Lang M, Thompson R, Turaga N (2018). BiocParallel: Bioconductor facilities for parallel evaluation. R package version 1.14.1, https://github.com/Bioconductor/BiocParallel.

Love MI, Huber W, Anders S (2014). “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.” Genome Biology, 15, 550. doi: 10.1186/s13059-014-0550-8.

Examples

# load example filtered count_table, condition_table and combinations
# generated by comb_gen function
# example dataset containing 1000 genes, 4 replicates and 5 comb. per rep.
# level
data(count_table.filtered.partial, package = "ERSSA")
data(combinations.partial, package = "ERSSA")
data(condition_table.partial, package = "ERSSA")

# run erssa_deseq2_parallel with heart condition as control
deg.partial = erssa_deseq2_parallel(count_table.filtered.partial,
    combinations.partial, condition_table.partial, control = 'heart',
    num_workers = 1)


zshao1/ERSSA documentation built on July 19, 2023, 9:20 p.m.