TCGAanalyze_DEA | R Documentation |
TCGAanalyze_DEA allows the user to perform differential expression analysis (DEA), using the edgeR or limma package, to identify differentially expressed genes (DEGs). It supports two-class analysis.
TCGAanalyze_DEA performs DEA using the following functions from edgeR:
edgeR::DGEList converts the count matrix into an edgeR object.
edgeR::estimateCommonDisp estimates a common dispersion, so that each gene is assigned the same dispersion estimate.
edgeR::exactTest performs pair-wise tests for differential expression between two groups.
edgeR::topTags takes the output from exactTest(), adjusts the raw p-values using the False Discovery Rate (FDR) correction, and returns the top differentially expressed genes.
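The edgeR steps listed above can be sketched on simulated data as follows. This is a minimal illustration that calls edgeR directly, not the internal code of TCGAanalyze_DEA; the count matrix, sample names, and group labels are invented for the example.

```r
library(edgeR)

# Hypothetical data: 200 genes x 6 samples of negative-binomial counts
set.seed(1)
counts <- matrix(
  rnbinom(200 * 6, mu = 50, size = 5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6))
)
group <- factor(rep(c("Normal", "Tumor"), each = 3))

# Convert the count matrix into an edgeR object
dge <- DGEList(counts = counts, group = group)

# Estimate one dispersion value shared by all genes
dge <- estimateCommonDisp(dge)

# Pair-wise exact test: Tumor versus Normal
et <- exactTest(dge, pair = c("Normal", "Tumor"))

# Adjust p-values (Benjamini-Hochberg FDR) and return all genes, ranked
tt <- topTags(et, n = nrow(counts), adjust.method = "BH")
head(tt$table)
```

The resulting table carries the logFC, logCPM, PValue, and FDR columns.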
TCGAanalyze_DEA performs DEA using the following functions from limma:
limma::makeContrasts constructs a matrix of custom contrasts.
limma::lmFit fits a linear model for each gene given a series of arrays.
limma::contrasts.fit computes estimated coefficients and standard errors for a given set of contrasts, given a linear model fit to microarray data.
limma::eBayes computes moderated t-statistics, moderated F-statistic, and log-odds of differential expression from a microarray linear model fit, by empirical Bayes moderation of the standard errors towards a common value.
limma::topTable extracts a table of the top-ranked genes from a linear model fit.
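The limma steps listed above can be sketched in the same way. This is a minimal illustration that calls limma directly on simulated log-scale expression values; the data, the group labels, and the contrast name TumorVsNormal are assumptions made for the example, and the design matrix that TCGAanalyze_DEA builds internally may differ.

```r
library(limma)

# Hypothetical data: 100 genes x 6 samples of log-scale expression values
set.seed(1)
expr <- matrix(
  rnorm(100 * 6, mean = 8),
  nrow = 100,
  dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6))
)
group <- factor(rep(c("Normal", "Tumor"), each = 3))

# Design matrix with one column per group (no intercept)
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Fit a linear model for each gene
fit <- lmFit(expr, design)

# Define the contrast of interest and re-express the fit in terms of it
contr <- makeContrasts(TumorVsNormal = Tumor - Normal, levels = design)
fit2 <- contrasts.fit(fit, contr)

# Empirical Bayes moderation of the standard errors
fit2 <- eBayes(fit2)

# Extract all genes, ranked, with BH-adjusted p-values
res <- topTable(fit2, coef = 1, number = Inf, adjust.method = "BH")
head(res)
```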
TCGAanalyze_DEA(
mat1,
mat2,
metadata = TRUE,
Cond1type,
Cond2type,
pipeline = "edgeR",
method = "exactTest",
fdr.cut = 1,
logFC.cut = 0,
batch.factors = NULL,
ClinicalDF = data.frame(),
paired = FALSE,
log.trans = FALSE,
voom = FALSE,
trend = FALSE,
MAT = data.frame(),
contrast.formula = "",
Condtypes = c()
)
mat1 |
numeric matrix, each row represents a gene, each column represents a sample with Cond1type |
mat2 |
numeric matrix, each row represents a gene, each column represents a sample with Cond2type |
metadata |
boolean; set to TRUE to add metadata |
Cond1type |
a string containing the class label of the samples in mat1 (e.g., control group) |
Cond2type |
a string containing the class label of the samples in mat2 (e.g., case group) |
pipeline |
a string to specify which package to use ("limma" or "edgeR") |
method |
a string specifying the edgeR method to use, either 'glmLRT' or 'exactTest'. 'glmLRT' fits a negative binomial generalized log-linear model to the read counts for each gene; 'exactTest' computes genewise exact tests for differences in the means between two groups of negative-binomially distributed counts. |
fdr.cut |
a threshold to filter DEGs according to their corrected p-value (FDR) |
logFC.cut |
a threshold to filter DEGs according to their logFC |
batch.factors |
a vector containing strings to specify options for batch correction. Options are "Plate", "TSS", "Year", "Portion", "Center", and "Patients" |
ClinicalDF |
a dataframe returned by GDCquery_clinic() to be used to extract year data |
paired |
boolean to account for paired or non-paired samples. Set to TRUE for the paired case |
log.trans |
boolean to perform log cpm transformation. Set to TRUE for log transformation |
voom |
boolean to perform voom transformation for limma-voom pipeline. Set to TRUE for voom transformation |
trend |
boolean to perform limma-trend pipeline. Set to TRUE to go through limma-trend |
MAT |
matrix containing the expression set, with all samples in columns and genes in rows. Do not provide if mat1 and mat2 are used |
contrast.formula |
string input to determine coefficients and to design contrasts in a customized way |
Condtypes |
vector of grouping for samples in MAT |
table of DEGs containing, for each gene, logFC, logCPM, pValue, and FDR, also for each contrast
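To illustrate how the fdr.cut and logFC.cut thresholds act on such a table, here is a small base R sketch. The table values and gene names are invented, the p-value column follows edgeR's PValue naming, and the exact comparison operators used inside TCGAanalyze_DEA are an assumption.

```r
# Hypothetical result table with per-gene statistics
degs <- data.frame(
  logFC  = c(2.5, -0.3, -1.8, 0.9),
  logCPM = c(5.1, 7.2, 3.4, 6.0),
  PValue = c(0.0001, 0.4, 0.002, 0.03),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Benjamini-Hochberg correction of the raw p-values
degs$FDR <- p.adjust(degs$PValue, method = "BH")

# Illustrative thresholds, stricter than the defaults
fdr.cut <- 0.05
logFC.cut <- 1

# Keep genes passing both the FDR and the absolute logFC threshold
keep <- degs$FDR <= fdr.cut & abs(degs$logFC) >= logFC.cut
filtered <- degs[keep, ]
filtered
```

Under this logic, the defaults fdr.cut = 1 and logFC.cut = 0 keep every gene, so filtering only takes effect when stricter values are supplied.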
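library(TCGAbiolinks)
# dataBRCA and geneInfo are assumed to be the example datasets shipped with TCGAbiolinks
dataNorm <- TCGAbiolinks::TCGAanalyze_Normalization(dataBRCA, geneInfo)
# Quantile filtering of the normalized matrix with a 0.25 cutoff
dataFilt <- TCGAanalyze_Filtering(tabDF = dataNorm, method = "quantile", qnt.cut = 0.25)
# Select normal tissue (NT) and primary tumor (TP) samples by barcode
samplesNT <- TCGAquery_SampleTypes(colnames(dataFilt), typesample = c("NT"))
samplesTP <- TCGAquery_SampleTypes(colnames(dataFilt), typesample = c("TP"))
# Two-class DEA with the default edgeR exactTest pipeline
dataDEGs <- TCGAanalyze_DEA(
  mat1 = dataFilt[, samplesNT],
  mat2 = dataFilt[, samplesTP],
  Cond1type = "Normal",
  Cond2type = "Tumor"
)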