getDE.BID.2G: Differential Expression Analysis and Differential Activity...

View source: R/pipeline_functions.R

getDE.BID.2GR Documentation

Differential Expression Analysis and Differential Activity Analysis Between 2 Sample Groups Using Bayesian Inference

Description

getDE.BID.2G is a function performs differential gene expression analysis and differential driver activity analysis between control group (parameter G0) and experimental group (parameter G1).

Usage

getDE.BID.2G(
  eset,
  output_id_column = NULL,
  G1 = NULL,
  G0 = NULL,
  G1_name = NULL,
  G0_name = NULL,
  method = "Bayesian",
  family = gaussian,
  pooling = "full",
  logTransformed = TRUE,
  verbose = TRUE
)

Arguments

eset

ExpressionSet class object, contains gene expression data or driver activity data.

output_id_column

character, the column names of Biobase::fData(eset). This option is useful when the eset expression matrix is at transcript-level, and user is expecting to see the gene-level comparison statistics. If NULL, rownames of the Biobase::fData(eset) will be used. Default is NULL.

G1

a vector of characters, the sample names of experimental group.

G0

a vecotr of characters, the sample names of control group.

G1_name

character, the name of experimental group (e.g. "Male"). Default is "G1".

G0_name

character, the name of control group (e.g. "Female"). Default is "G0".

method

character, users can choose between "MLE" and "Bayesian". "MLE", the maximum likelihood estimation, will call generalized linear model(glm/glmer) to perform data regression. "Bayesian", will call Bayesian generalized linear model (bayesglm) or multivariate generalized linear mixed model (MCMCglmm) to perform data regression. Default is "Bayesian".

family

character or family function or the result of a call to a family function. This parameter is used to define the model's error distribution. See ?family for details. Currently, options are gaussian, poisson, binomial(for two-group sample classes)/category(for multi-group sample classes)/ordinal(for multi-group sample classes with class_ordered=TRUE). If set with gaussian or poission, the response variable in the regression model will be the expression level, and the independent variable will be the sample's phenotype. If set with binomial, the response variable in the regression model will be the sample phenotype, and the independent variable will be the expression level. For binomial, category and ordinal input, the family will be automatically reset, based on the sample's class level and the setting of class_ordered. Default is gaussian.

pooling

character, users can choose from "full","no" and "partial". "full", use probes as independent observations. "no", use probes as independent variables in the regression model. "partial", use probes as random effect in the regression model. Default is "full".

logTransformed

logical, if TRUE, log tranformation of the expression value will be performed.

verbose

logical, if TRUE, sample names of both groups will be printed. Default is TRUE.

Value

Return a data frame. Rows are genes/drivers, columns are "ID", "logFC", "AveExpr", "t", "P.Value", "adj.P.Val", "Z-statistics", "Ave.G1" and "Ave.G0". Names of the columns may vary from different group names. Sorted by P.Value.

Examples

mat <- matrix(c(0.50099,-1.2108,-1.0524,
                0.34881,-0.13441,0.87112,
                1.84579,-2.0356,-2.6025,
                1.62954,1.88281,1.29604),nrow=2,byrow=TRUE)
rownames(mat) <- c('A1','A2')
colnames(mat) <-  c('Case-rep1','Case-rep2','Case-rep3',
                'Control-rep1','Control-rep2','Control-rep3')
tmp_eset <- generate.eset(mat,feature_info=data.frame(row.names=rownames(mat),
            probe=rownames(mat),gene=rep('GeneX',2),
            stringsAsFactors = FALSE))
res1 <- getDE.BID.2G(tmp_eset,output_id_column='probe',
        G1=c('Case-rep1','Case-rep2','Case-rep3'),
        G0=c('Control-rep1','Control-rep2','Control-rep3'))
res2 <- getDE.BID.2G(tmp_eset,output_id_column='gene',
        G1=c('Case-rep1','Case-rep2','Case-rep3'),
        G0=c('Control-rep1','Control-rep2','Control-rep3'))
res3 <- getDE.BID.2G(tmp_eset,output_id_column='gene',
        G1=c('Case-rep1','Case-rep2','Case-rep3'),
        G0=c('Control-rep1','Control-rep2','Control-rep3'),
        pooling='partial')
## Not run: 
analysis.par <- list()
analysis.par$out.dir.DATA <- system.file('demo1','driver/DATA/',package = "NetBID2")
NetBID.loadRData(analysis.par=analysis.par,step='ms-tab')
phe_info <- Biobase::pData(analysis.par$cal.eset)
each_subtype <- 'G4'
G0  <- rownames(phe_info)[which(phe_info$`subgroup`!=each_subtype)] # get sample list for G0
G1  <- rownames(phe_info)[which(phe_info$`subgroup`==each_subtype)] # get sample list for G1
DE_gene_BID <- getDE.BID.2G(eset=analysis.par$cal.eset,
                                G1=G1,G0=G0,
                                G1_name=each_subtype,
                                G0_name='other')
DA_driver_BID <- getDE.BID.2G(eset=analysis.par$merge.ac.eset,
                                G1=G1,G0=G0,
                                G1_name=each_subtype,
                                G0_name='other')

## End(Not run)

jyyulab/NetBID documentation built on Dec. 23, 2024, 6:34 a.m.