find_gene_statistics: Retrieve statistics for all or a subset of genes present in...

View source: R/subset_metaanalysis.R

find_gene_statistics    R Documentation

Retrieve statistics for all or a subset of genes present in the meta-analysis

Description

For each gene, retrieve the mean, median, and standard deviation of the log2FC, together with the number of comparisons in which the gene is found differentially expressed.

Usage

find_gene_statistics(dataset, genes = NULL, top = NULL)

Arguments

dataset

list of 3 lists: the first with gene IDs (called names), the second with the adjusted p-values for each gene (called adjpval), and the third with log2 fold changes (called log2FC). Each of these three lists has one sublist per comparison, holding the gene IDs, adjusted p-values, and log2FCs, respectively, from the differential expression analysis of that comparison.
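
As an illustration of the structure described above, here is a minimal sketch of a toy dataset with two comparisons. The comparison labels, gene IDs, and values are invented, and storing each sublist as a plain vector is an assumption; the package's own list_array data may differ in detail.

```r
# Toy dataset with two comparisons; all gene IDs and values are invented.
toy_dataset <- list(
  names = list(
    comparison_A = c("gene1", "gene2", "gene3"),
    comparison_B = c("gene1", "gene3")
  ),
  adjpval = list(
    comparison_A = c(0.001, 0.020, 0.040),
    comparison_B = c(0.010, 0.030)
  ),
  log2FC = list(
    comparison_A = c(1.8, -1.2, 0.9),
    comparison_B = c(2.1, 1.1)
  )
)

# Within each comparison the three vectors are parallel:
# element i of names, adjpval, and log2FC refers to the same gene.
lengths(toy_dataset$names)
```

The point of the sketch is only the nesting: three top-level elements named names, adjpval, and log2FC, each with one entry per comparison.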

genes

vector of gene names. Limits the results to the elements of genes. Cannot be set together with top.

top

integer. Limits the results to the given number of genes with the highest occurrence in the dataset. Cannot be set together with genes.

Value

a data frame with 6 columns and as many rows as genes. The column names reports the gene names used in the dataset. occurences indicates the number of comparisons in which a gene is found differentially expressed. median_log2FC, mean_log2FC and sd_log2FC are the median, mean, and standard deviation of the log2 fold changes of the gene across the comparisons in which it is encountered. pseudotscore is a statistic calculated as mean_log2FC/(sd_log2FC/sqrt(occurences)). Genes with a high absolute pseudo-t-score have a large fold change that varies little across comparisons and are found differentially expressed in a relatively high number of comparisons.
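
The pseudotscore formula above can be worked through by hand for a single gene. This is a minimal sketch using invented log2 fold changes; it is not the package's implementation, and how find_gene_statistics handles genes with a single occurrence (where R's sd() returns NA) is not stated here.

```r
# Hypothetical log2 fold changes for one gene across four comparisons.
log2fc <- c(1.6, 1.8, 2.0, 1.4)

occurences    <- length(log2fc)   # comparisons in which the gene is DE
mean_log2FC   <- mean(log2fc)
median_log2FC <- median(log2fc)
sd_log2FC     <- sd(log2fc)       # sample standard deviation (n - 1 denominator)

# Formula from the Value section: mean / (sd / sqrt(n)).
pseudotscore <- mean_log2FC / (sd_log2FC / sqrt(occurences))
pseudotscore
```

Written this way, the value has the same form as a one-sample t statistic of the log2 fold changes against zero, which is why it grows with a larger mean, a smaller spread, and more occurrences.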

Examples

# create a list of lists where only the 500 most significant genes with
# adjusted p-value < 0.05 and absolute log2 fold change > log2(1.5) are included
data(list_array) # load data
list_array.05_fc1.5_max500 <- subset_metanalysis(dataset = list_array, adjpval = 0.05, abslog2FC = log2(1.5), max_n_genes = 500)
# retrieve the statistics for the 500 most frequent DE genes
find_gene_statistics(dataset = list_array.05_fc1.5_max500, top = 500)

Ilarius/metaDEA documentation built on May 6, 2023, 6:47 p.m.