runImpulseDE2: ImpulseDE2 wrapper

View source: R/ImpulseDE2_main.R

runImpulseDE2R Documentation

ImpulseDE2 wrapper

Description

Wrapper to run ImpulseDE2 on bulk omics count data. This wrapper can perform the entire analysis pipeline of ImpulseDE2 on its own if the right parameters are supplied. To run ImpulseDE2 on bulk omics count data, use the minimal parameter set:

  • matCountData

  • dfAnnotation

  • boolCaseCtrl

  • vecConfounders

Additionally, you can provide:

  • scaNProc to set the number of processes for parallelisation.

  • scaQThres to set the cut off for your DE gene list.

  • vecDispersionsExternal to supply external dispersion parameters which may be necessary depending on your confounding factors (runImpulseDE2 will tell you if it is necessary).

  • vecSizeFactorsExternal to supply external size factors.

  • boolVerbose to control stdout output.

Usage

runImpulseDE2(matCountData = NULL, dfAnnotation = NULL,
  boolCaseCtrl = FALSE, vecConfounders = NULL, scaNProc = 1,
  scaQThres = NULL, vecDispersionsExternal = NULL,
  vecSizeFactorsExternal = NULL, boolIdentifyTransients = FALSE,
  boolVerbose = TRUE)

Arguments

matCountData

(matrix genes x samples) [Default NULL] Read count data, unobserved entries are NA. Can be SummarizedExperiment object.

dfAnnotation

(data frame samples x covariates) Sample, Condition, Time (numeric), TimeCateg (str) (and confounding variables if given). Annotation table with covariates for each sample.

boolCaseCtrl

(bool) [Default FALSE] Whether to perform case-control analysis. Does case-only analysis if FALSE.

vecConfounders

(vector of strings number of confounding variables) Factors to correct for during batch correction. Have to supply dispersion factors if more than one is supplied. Names refer to columns in dfAnnotation.

scaNProc

(scalar) [Default 1] Number of processes for parallelisation.

scaQThres

(scalar) [Default NULL] FDR-corrected p-value cutoff for significance.

vecDispersionsExternal

(vector length number of genes in matCountData) [Default NULL] Externally generated list of gene-wise dispersion factors which overides DESeq2 generated dispersion factors.

vecSizeFactorsExternal

(vector length number of cells in matCountData) [Default NULL] Externally generated list of size factors which override size factor computation in ImpulseDE2.

boolIdentifyTransients

(bool) [Defaul FALSE] Whether to identify transiently activated or deactivated genes. This involves an additional fitting of sigmoidal models and hypothesis testing between constant, sigmoidal and impulse model.

boolVerbose

(bool) [Default TRUE] Whether to print progress to stdout.

Details

ImpulseDE2 is based on the impulse model proposed by Chechik and Koller (Chechik and Koller, 2009). The computational complexity of ImpulseDE2 is linear in the number of genes and linear in the number of samples.

Value

(object of class ImpulseDE2Object) This object can be treated as a list with 2 elements: (list length 2)

  • vecDEGenes (list number of genes) Genes IDs identified as differentially expressed by ImpulseDE2 at threshold scaQThres.

  • dfDEAnalysis (data frame samples x reported characteristics) Summary of fitting procedure and differential expression results for each gene.

    • Gene: Gene ID.

    • p: P-value for differential expression.

    • padj: Benjamini-Hochberg false-discovery rate corrected p-value for differential expression analysis.

    • loglik_full: Loglikelihood of full model.

    • loglik_red: Loglikelihood of reduced model.

    • df_full: Degrees of freedom of full model.

    • df_red: Degrees of freedom of reduced model

    • mean: Inferred mean parameter of constant model of first batch. From combined samples in case-ctrl.

    • allZero (bool) Whether there were no observed non-zero observations of this gene. If TRUE, fitting and DE analsysis were skipped and entry is NA.

    Entries only present in case-only DE analysis:

    • converge_impulse: Convergence status of optim for impulse model fit (full model).

    • converge_const: Convergence status of optim for constant model fit (reduced model).

    Entries only present in case-control DE analysis:

    • converge_combined: Convergence status of optim for impulse model fit to case and control samples combined (reduced model).

    • converge_case: Convergence status of optim for impulse model fit to samples of case condition (full model 1/2).

    • converge_control: Convergence status of optim for impulse model fit to samples of control condition (full model 2/2).

    Entries only present if boolIdentifyTransients is TRUE:

    • converge_sigmoid: Convergence status of optim for sigmoid model fit to samples of case condition.

    • impulseTOsigmoid_p: P-value of loglikelihood ratio test impulse model fit versus sigmoidal model on samples of case condition.

    • impulseTOsigmoid_padj: Benjamini-Hochberg false-discovery rate corrected p-value of loglikelihood ratio test impulse model fit versus sigmoid model on samples of case condition.

    • sigmoidTOconst_p: P-value of loglikelihood ratio test sigmoidal model fit versus constant model on samples of case condition.

    • sigmoidTOconst_padj: Benjamini-Hochberg false-discovery rate corrected p-value of loglikelihood ratio test sigmoidal model fit versus constant model on samples of case condition.

    • isTransient (bool) Whether gene is transiently activated or deactivated and differentially expressed.

    • isMonotonous (bool) Whether gene is not transiently activated or deactivated and differentially expressed. This scenario corresponds to a montonous expression level increase or decrease.

Author(s)

David Sebastian Fischer

See Also

Calls the following functions: processData, runDESeq2, computeNormConst, fitModels, fitSigmoidModels, runDEAnalysis. The following functions are additionally available to the user: fitSigmoidModels, plotGenes, plotHeatmap, runDEAnalysis, simulateDataSetImpulseDE2.

Examples

lsSimulatedData <- simulateDataSetImpulseDE2(
vecTimePointsA   = rep(seq(1,8),3),
vecTimePointsB   = NULL,
vecBatchesA      = NULL,
vecBatchesB      = NULL,
scaNConst        = 30,
scaNImp          = 10,
scaNLin          = 10,
scaNSig          = 10)
objectImpulseDE2 <- runImpulseDE2(
matCountData    = lsSimulatedData$matObservedCounts, 
dfAnnotation    = lsSimulatedData$dfAnnotation,
boolCaseCtrl    = FALSE,
vecConfounders  = NULL,
scaNProc        = 1 )
head(objectImpulseDE2$dfImpulseDE2Results)


YosefLab/ImpulseDE2 documentation built on Sept. 17, 2022, 2:45 a.m.