View source: R/coexpression_pathway_enrichment.R
This function computes enrichment p-values of pathways in a co-expression network. See details.
coexpression_pathway_enrichment(
  net,
  pathways,
  min.gene = 5,
  max.gene = 100,
  iter = 10000,
  seed = NULL,
  na.rm = F,
  neg.treat = "error"
)
net | matrix or data.frame. A gene x gene matrix representing edge weights between genes in a co-expression network. Gene names must be available as row and column names. See details.
pathways | list. List of pathways where each entry contains the genes in that pathway. Pathway names may be provided as names of the list.
min.gene | integer. Each pathway must have at least min.gene genes available in net.
max.gene | integer. Each pathway must have at most max.gene genes available in net.
iter | integer. The number of random iterations, i.e. the number of random gene sets used to compute the null distribution.
seed | integer or NULL. Random number generator seed.
na.rm | logical. Should edges with NA values be removed?
neg.treat | character representing how negative values in net are treated. See details.
To compute the enrichment p-value of a pathway in a co-expression network,
we define the score of a pathway as the sum of the weights in net
of all possible edges between the genes in the pathway. The enrichment p-value is
then defined as the probability that a random gene set with the same number of genes
scores at least as high as the pathway.
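The score definition above can be sketched in a few lines of R. This is only an illustration: `pathway_score` and `toy_net` are hypothetical names, and counting each edge once (upper triangle, diagonal excluded) is an assumption about how the package sums the weights.

```r
# Hypothetical helper: sum of edge weights over all unordered gene pairs.
pathway_score <- function(net, genes) {
  genes <- intersect(genes, rownames(net))  # keep genes present in the network
  sub <- net[genes, genes, drop = FALSE]    # sub-network of the pathway
  sum(sub[upper.tri(sub)])                  # each edge once, diagonal excluded
}

# Toy symmetric network with four genes.
toy_genes <- c("A", "B", "C", "D")
toy_net <- matrix(0, 4, 4, dimnames = list(toy_genes, toy_genes))
toy_net["A", "B"] <- toy_net["B", "A"] <- 0.9
toy_net["A", "C"] <- toy_net["C", "A"] <- 0.5
toy_net["B", "C"] <- toy_net["C", "B"] <- 0.2

# Edges A-B, A-C and B-C contribute 0.9 + 0.5 + 0.2.
toy_score <- pathway_score(toy_net, c("A", "B", "C"))
```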
To get the null distribution, we generate iter random gene sets,
each consisting of the same number of randomly selected genes as the pathway, compute
their scores, and fit a normal distribution to those scores.
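As a rough sketch of this procedure, the code below draws random gene sets from a toy network, scores them, and derives both a normal-approximation p-value and an empirical one. The names `score_set`, `toy_net` and `null_scores` are hypothetical, and the +1 correction in the empirical p-value is an assumption; the package may compute `p.empirical` differently.

```r
set.seed(1)
toy_genes <- paste0("g", 1:30)
w <- matrix(runif(30 * 30), 30, 30, dimnames = list(toy_genes, toy_genes))
toy_net <- (w + t(w)) / 2                      # symmetric toy network

# Hypothetical scoring helper: sum of weights over unordered gene pairs.
score_set <- function(net, g) {
  sub <- net[g, g, drop = FALSE]
  sum(sub[upper.tri(sub)])
}

pathway <- toy_genes[1:6]
observed <- score_set(toy_net, pathway)

# Null distribution: scores of random gene sets of the same size.
iter <- 1000
null_scores <- replicate(iter, score_set(toy_net, sample(toy_genes, length(pathway))))

# p-value from a normal distribution fitted to the null scores.
p_normal <- pnorm(observed, mean = mean(null_scores), sd = sd(null_scores),
                  lower.tail = FALSE)

# Empirical p-value; the +1 correction is an assumption, not the package's stated rule.
p_empirical <- (sum(null_scores >= observed) + 1) / (iter + 1)
```

The normal fit can resolve p-values smaller than 1 / iter, which the empirical estimate cannot, at the cost of assuming the null scores are roughly normal.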
Enrichment of a pathway is computed only if at least min.gene and
at most max.gene genes from the pathway are available in net.
These criteria help to avoid pathways that are too small or too large.
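The size criteria could be applied along the following lines. `filter_pathways` is a hypothetical helper written for illustration, not a function exported by the package.

```r
# Hypothetical helper: keep pathways whose genes available in the network
# number between min.gene and max.gene (inclusive).
filter_pathways <- function(pathways, net_genes, min.gene = 5, max.gene = 100) {
  available <- lapply(pathways, intersect, net_genes)  # drop genes missing from net
  n <- lengths(available)
  available[n >= min.gene & n <= max.gene]
}

net_genes <- c("TP53", "ATM", "MDM2", "EGFR", "CD96")
toy_pathways <- list(
  p1 = c("TP53", "ATM", "MDM2", "XYZ1"),  # 3 of 4 genes are in the network
  p2 = c("EGFR", "XYZ2")                  # only 1 gene is in the network
)

kept <- filter_pathways(toy_pathways, net_genes, min.gene = 3, max.gene = 100)
```

Here only `p1` survives, because `p2` has a single gene available in the network.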
Each value in net should represent the relative probability that
the corresponding edge is true. In other words, larger values should
represent higher confidence in the corresponding edges.
If the sign of values in net represents positive or negative
associations between genes, you should probably provide absolute values.
If you still want to allow negative values in net,
you may set neg.treat = "allow".
In this case, any negative value will represent lower confidence than
any non-negative value.
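For example, a signed correlation network can be converted to confidence-like weights by taking absolute values. This is a minimal sketch on simulated data; `expr`, `signed_net` and `unsigned_net` are hypothetical names.

```r
set.seed(42)
# Simulated expression matrix: 20 samples x 10 genes.
expr <- matrix(rnorm(200), nrow = 20,
               dimnames = list(NULL, paste0("g", 1:10)))

signed_net <- cor(expr)          # values in [-1, 1]; sign encodes direction
unsigned_net <- abs(signed_net)  # strong negative correlations become high weights
```

With `unsigned_net`, a correlation of -0.8 and one of 0.8 carry the same confidence, which matches the interpretation of larger values as stronger evidence for an edge.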
net must be a square matrix.
Gene names must be available as row and column names.
Gene names must be unique.
net must be symmetric when rows and columns are identically ordered.
Diagonal entries are ignored.
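These requirements can be checked before calling the function. The sketch below uses a hypothetical `check_net` helper; it is not part of the package, and the package's own validation may differ.

```r
# Hypothetical validator for the requirements listed above.
check_net <- function(net) {
  net <- as.matrix(net)
  stopifnot(
    nrow(net) == ncol(net),                  # square
    !is.null(rownames(net)),                 # gene names as row names
    !is.null(colnames(net)),                 # gene names as column names
    !anyDuplicated(rownames(net)),           # unique gene names
    setequal(rownames(net), colnames(net))   # same genes on both axes
  )
  net <- net[, rownames(net), drop = FALSE]  # order columns like rows
  stopifnot(isSymmetric(unname(net)))        # symmetric once identically ordered
  invisible(net)
}

g <- c("A", "B", "C")
m <- matrix(c(0, 1, 2,
              1, 0, 3,
              2, 3, 0), nrow = 3, dimnames = list(g, g))
shuffled <- m[, c("C", "A", "B")]  # same network, columns in a different order
checked <- check_net(shuffled)     # passes and returns the reordered matrix
```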
A data.frame with the following columns.
pathway | Pathway name taken from pathways.
n.gene | Number of genes from the pathway available in net.
p | p-value for the pathway (computed using a fitted normal distribution as null).
p.empirical | Empirical p-value for the pathway.
fdr | False discovery rate computed from p using the Benjamini-Hochberg method.
fdr.empirical | Empirical false discovery rate computed from p.empirical using the Benjamini-Hochberg method.
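The Benjamini-Hochberg adjustment is available in base R through `p.adjust`. The p-values below are made up for illustration, and whether the package calls `p.adjust` internally is an assumption.

```r
# Made-up p-values for four pathways.
p <- c(pathway1 = 0.001, pathway2 = 0.02, pathway3 = 0.04, pathway4 = 0.30)

# Benjamini-Hochberg adjusted p-values (false discovery rates).
fdr <- p.adjust(p, method = "BH")

# Number of pathways passing a 5% FDR threshold.
n_sig <- sum(fdr <= 0.05)
```

In this toy case the adjusted values are 0.004, 0.04, about 0.053 and 0.30, so two pathways pass the 5% threshold.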
genes = c('TP53', 'RBM3', 'SF3', 'LIM12', 'ATM', 'TMEM160', 'BCL2L1', 'MDM2',
          'PDR', 'MEG3', 'EGFR', 'CD96', 'KEAP1', 'SRSF1', 'TSEN2')
# random gene x gene weights, with gene names as row and column names
dummy_net = matrix(rnorm(length(genes)^2), nrow = length(genes), dimnames = list(genes, genes))
dummy_net = abs((dummy_net + t(dummy_net))/2) # symmetric network
# pathways may contain genes that are absent from the network (e.g. 'SF1')
dummy_pathways = list(pathway1=c('TP53', 'RBM3', 'SF1', 'SF5'),
                      pathway2=c('LIM12', 'MDM2', 'BCL2L1', 'TMEM160', 'ATM'),
                      pathway3=c('EGFR', 'TP53', 'CD96', 'SRSF1', 'RBM14'))
enrich_res = coexpression_pathway_enrichment(net = dummy_net,
                                             pathways = dummy_pathways,
                                             min.gene = 3)
print(enrich_res)
# count pathways with FDR at or below 5%
n_sig = sum(enrich_res$fdr <= 0.05)
print(sprintf('Number of significantly enriched pathways: %d', n_sig))