runEnrichmentAnalysis: Feature Set Enrichment Analysis
In bioFAM/MOFA: Multi-Omics Factor Analysis (MOFA)


Method to perform feature set enrichment analysis on the feature loadings.
The input is a data structure containing the feature set membership, usually relating biological pathways to genes.
The output is a list whose p-value and feature-set statistic elements are matrices of dimensions (number_gene_sets, number_factors).

runEnrichmentAnalysis(object, view, feature.sets, factors = "all",
  local.statistic = c("loading", "cor", "z"),
  global.statistic = c("mean.diff", "rank.sum"),
  statistical.test = c("parametric", "cor.adj.parametric",
  "permutation"), transformation = c("abs.value", "none"),
  min.size = 10, nperm = 1000, cores = 1, p.adj.method = "BH",
  alpha = 0.1)

`object`	a `MOFAmodel` object.
`view`	name of the view to perform enrichment on. Make sure that the feature names of the feature set file match the feature names in the MOFA model.
`feature.sets`	data structure that holds feature set membership information. Must be either a binary membership matrix (rows are feature sets and columns are features) or a list of feature set indexes (see vignette for details).
`factors`	character vector with the factor names to perform enrichment on. Alternatively, a numeric vector with the index of the factors. Default is all factors.
`local.statistic`	the feature statistic used to quantify the association between each feature and each factor. Must be one of the following: "loading" (the output from MOFA, default), "cor" (the correlation coefficient between the factor and each feature), or "z" (a z-score derived from the correlation coefficient).
`global.statistic`	the feature set statistic computed from the feature statistics. Must be one of the following: "mean.diff" (difference in means between the foreground set and the background set, default) or "rank.sum" (difference in rank sums between the foreground set and the background set).
`statistical.test`	the statistical test used to compute the significance of the feature set statistics under a competitive null hypothesis. Must be one of the following: "parametric" (very liberal, default), "cor.adj.parametric" (very conservative, adjusts for the inter-gene correlation), or "permutation" (non-parametric, recommended if you can run a sufficient number of permutations).
`transformation`	optional transformation to apply to the feature-level statistics. Must be one of the following: "none" or "abs.value" (default).
`min.size`	minimum size of a feature set. Default is 10.
`nperm`	number of permutations. Only relevant if statistical.test is set to "permutation". Default is 1000.
`cores`	number of cores to run the permutation analysis in parallel. Only relevant if statistical.test is set to "permutation". Default is 1.
`p.adj.method`	method to adjust p-values factor-wise for multiple testing. Can be any method listed in p.adjust.methods. Default is "BH" (the Benjamini-Hochberg procedure).
`alpha`	FDR threshold to generate lists of significant pathways. Default is 0.1.
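As an illustration of the binary membership matrix described for `feature.sets`, the following base R sketch builds one from a named list of gene sets. The gene and pathway names are hypothetical toy values, and the exact storage type the function expects is not stated here, so treat the integer 0/1 encoding as an assumption.

```r
# Hypothetical toy gene sets; the names are illustrative only.
gene.sets <- list(
  pathwayA = c("gene1", "gene2", "gene3"),
  pathwayB = c("gene3", "gene4")
)

# All features that appear in at least one set.
features <- sort(unique(unlist(gene.sets)))

# Rows are feature sets, columns are features; 1 marks membership.
membership <- t(vapply(
  gene.sets,
  function(genes) as.integer(features %in% genes),
  integer(length(features))
))
colnames(membership) <- features

print(membership)
```

In practice the column names would need to match the feature names of the chosen view in the MOFA model.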

This function relates the factors to pre-defined biological pathways by performing a gene set enrichment analysis on the loadings. The general idea is to compute an activity score for every pathway in each factor based on its corresponding gene loadings.
This function is particularly useful when a factor is difficult to characterise based only on the genes with the highest loading.
We provide several pre-built gene set matrices in the MOFAdata package. See https://github.com/bioFAM/MOFAdata for details.
The implementation is based on the pcgse function (principal component gene set enrichment), with some modifications. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4543476 for the mathematical details.
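To make the idea of a pathway activity score concrete, here is a minimal base R sketch of a "mean.diff"-style statistic for one factor and one gene set, tested once parametrically and once by permutation. It is not the package's implementation: the simulated loadings, the one-sided pooled-variance t-test, and the shuffling of set membership are all assumptions of this sketch.

```r
set.seed(1)

# Simulated loadings of 200 genes on a single factor (toy data).
loadings <- rnorm(200)
names(loadings) <- paste0("gene", seq_along(loadings))

# The first 20 genes form the foreground set; shift their loadings.
in.set <- rep(c(TRUE, FALSE), times = c(20, 180))
loadings[in.set] <- loadings[in.set] + 2

# "abs.value" transformation of the feature-level statistic.
stat <- abs(loadings)

# Set-level statistic: foreground mean minus background mean.
mean.diff <- mean(stat[in.set]) - mean(stat[!in.set])

# Parametric test (assumption: one-sided two-sample t-test, pooled variance).
p.parametric <- t.test(stat[in.set], stat[!in.set],
                       var.equal = TRUE, alternative = "greater")$p.value

# Permutation test (assumption: shuffle which genes belong to the set).
nperm <- 1000
null.stats <- replicate(nperm, {
  perm <- sample(in.set)
  mean(stat[perm]) - mean(stat[!perm])
})
p.permutation <- (sum(null.stats >= mean.diff) + 1) / (nperm + 1)
```

Both tests are competitive: the set is compared against all genes outside it, not against a fixed reference value.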

a list with the following elements:

`feature.statistics`	feature statistics
`set.statistics`	feature-set statistics
`pval`	raw p-values
`pval.adj`	adjusted p-values
`sigPathways`	a list with enriched pathways
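The relationship between `pval`, `pval.adj` and `sigPathways` can be sketched in base R as follows. The p-values are made-up toy numbers, and the use of `<=` when thresholding at `alpha` is an assumption; the package may apply the threshold differently.

```r
# Toy raw p-values: rows are feature sets, columns are factors.
pval <- matrix(
  c(0.001, 0.20, 0.03,
    0.50,  0.04, 0.90),
  nrow = 3,
  dimnames = list(c("pathwayA", "pathwayB", "pathwayC"),
                  c("Factor1", "Factor2"))
)

# Adjust factor-wise, i.e. separately within each column.
pval.adj <- apply(pval, 2, p.adjust, method = "BH")
dimnames(pval.adj) <- dimnames(pval)

# One vector of significant pathway names per factor.
alpha <- 0.1
sig.pathways <- lapply(
  setNames(colnames(pval.adj), colnames(pval.adj)),
  function(f) rownames(pval.adj)[pval.adj[, f] <= alpha]
)

print(sig.pathways)
```

Because the adjustment is factor-wise, a pathway's adjusted p-value depends only on the other pathways tested for the same factor.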

# Example on the CLL data
library(MOFA)
filepath <- system.file("extdata", "CLL_model.hdf5", package = "MOFAdata")
MOFAobject <- loadModel(filepath)

# perform enrichment analysis on mRNA data using pre-built Reactome gene sets
data("reactomeGS", package = "MOFAdata")
fsea.results <- runEnrichmentAnalysis(MOFAobject, view="mRNA", feature.sets=reactomeGS)

# heatmap of enriched pathways per factor at 1% FDR
plotEnrichmentHeatmap(fsea.results, alpha=0.01)

# plot number of enriched pathways per factor at 1% FDR
plotEnrichmentBars(fsea.results, alpha=0.01)

# plot top 10 enriched pathways on factor 5
plotEnrichment(MOFAobject, fsea.results, factor=5, max.pathways=10)