analyzeGenesetTopology: Analyze Gene List Topology

View source: R/analyzeGeneListTopology.R

analyzeGenesetTopology    R Documentation

Analyze Gene List Topology

Description

Analyzes the topology of a gene list using gene correlation data and dimension-reduction techniques.

Usage

analyzeGenesetTopology(
  genesOfInterest,
  Sample_Type = "normal",
  Tissue = "all",
  crossComparisonType = c("PCA", "variantGenes", "coCorrelativeGenes", "pathwayEnrich"),
  pathwayType = c("simple"),
  setComparisonCutoff = "Auto",
  pathwayEnrichment = FALSE,
  pValueCutoff = 0.05,
  numTopGenesToPlot = "Auto",
  alternativeTSNE = TRUE,
  numClusters = "Auto",
  outputPrefix = "CorrelationAnalyzeR_Output",
  returnDataOnly = TRUE,
  pool = NULL,
  makePool = FALSE
)

Arguments

genesOfInterest

A vector of genes to analyze or the name of an official MSIGDB term.

Sample_Type

Type of RNA-Seq samples used to create the correlation data. Either "all", "normal", or "cancer". Can be a single value for all genes, or a vector corresponding to genesOfInterest. Default: "normal"

Tissue

Which tissue type should gene correlations be derived from? Can be a single value for all genes, or a vector corresponding to genesOfInterest. Run getTissueTypes() to see available tissues. Default: "all"

crossComparisonType

The types of topology tests to run (see Details). Default: c("PCA", "variantGenes", "coCorrelativeGenes", "pathwayEnrich")

pathwayType

Which pathway annotations should be considered? Options are listed in correlationAnalyzeR::MSIGDB_Geneset_Names. See the details of ?getTERM2GENE for more info. Default: "simple".

setComparisonCutoff

Only relevant for co-correlation analysis: the number of genes which must agree for a gene to be considered co-correlative within the input gene list. Default: "Auto"

pathwayEnrichment

Logical. If TRUE, pathway enrichment will be performed on variant genes (if "variantGenes" is selected) and/or on co-correlative genes (if "coCorrelativeGenes" is selected). Default: FALSE.

pValueCutoff

Numeric. The p-value cutoff applied when running all pathway enrichment tests. Default: 0.05.

numTopGenesToPlot

When creating a heatmap of the top co-correlative or top variant genes, how many genes should be plotted on the y axis? Default: "Auto"

alternativeTSNE

Logical. If TRUE, t-SNE is run as an alternative to PCA for visualizing large input gene lists. This is highly recommended, as gene lists with 100+ members cannot be visualized otherwise. Default: TRUE.

numClusters

The number of clusters to create with hclust or TSNE analysis. Default: "Auto"

outputPrefix

Prefix for saved files. Should include directory info. Ignored if returnDataOnly = TRUE. Default: "CorrelationAnalyzeR_Output"

returnDataOnly

Logical. If TRUE, only a list of analysis results is returned. Default: TRUE

pool

An object created by pool::dbPool for accessing the SQL database. If none is supplied, makePool controls whether one is created. Default: NULL

makePool

Logical. Should a pool be created if one is not supplied? Default: FALSE.
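To make the role of pValueCutoff concrete, the following is a minimal sketch of a single over-representation test in base R. It assumes a plain hypergeometric test of the kind used by clusterProfiler's enricher (named under Details); the gene names, set sizes, and the variables universe, pathway, and geneList are all hypothetical placeholders, and nothing here calls correlationAnalyzeR itself.

```r
# Synthetic sets: the gene names are placeholders, not real annotations.
universe <- paste0("G", 1:200)            # hypothetical background genes
pathway  <- universe[1:20]                # hypothetical pathway members
geneList <- universe[c(1:8, 101:112)]     # hypothetical input list of 20 genes

# Number of input genes that fall inside the pathway
overlap <- length(intersect(geneList, pathway))

# One-sided hypergeometric test: P(X >= overlap) when drawing
# length(geneList) genes at random from the universe
pValue <- phyper(overlap - 1,
                 m = length(pathway),
                 n = length(universe) - length(pathway),
                 k = length(geneList),
                 lower.tail = FALSE)

# Apply the cutoff in the same way the argument default suggests
pValueCutoff <- 0.05
enriched <- pValue < pValueCutoff
print(c(overlap = overlap, pValue = pValue))
print(enriched)
```

This sketch tests one gene set only, so it applies no multiple-testing adjustment; whether the function compares raw or adjusted p-values against pValueCutoff is not stated on this page.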

Details

analyzeGenesetTopology() uses the matrix of co-expression correlations to perform dimensionality reduction, clustering, and pathway enrichment. See the vignette for usage examples and information about the output format.
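The dimensionality reduction and clustering step can be pictured with base R alone. The sketch below is not the package's implementation: it simulates a correlation matrix (all values and the INPUT_/GENOME_ names are synthetic assumptions), runs prcomp with the input genes as observations, and then cuts an hclust tree on the first two principal components into a hypothetical number of clusters.

```r
set.seed(42)
inputGenes <- paste0("INPUT_", 1:8)   # hypothetical input gene list
nGenome <- 500                        # hypothetical number of genome-wide genes

# Simulate two groups of input genes whose genome-wide correlation
# profiles differ; tanh() keeps the simulated values inside (-1, 1).
base1 <- rnorm(nGenome)
base2 <- rnorm(nGenome)
corMat <- sapply(seq_along(inputGenes), function(i) {
  base <- if (i <= 4) base1 else base2
  tanh(base + rnorm(nGenome, sd = 0.3))
})
dimnames(corMat) <- list(paste0("GENOME_", seq_len(nGenome)), inputGenes)

# PCA with each input gene as one observation (rows after transposing)
pca <- prcomp(t(corMat), center = TRUE, scale. = FALSE)

# Hierarchical clustering on the first two principal components,
# cut into an assumed number of clusters
numClusters <- 2
hc <- hclust(dist(pca$x[, 1:2]))
clusters <- cutree(hc, k = numClusters)
print(clusters)
```

Because the two simulated groups are built from independent profiles, the first four and last four input genes end up in separate clusters; with real correlation data the choice of numClusters matters far more.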

Cross Comparison Types:
- variantGenes: The genes which best explain variation between genes within the input list. These genes can divide a list into functional groups.
- coCorrelativeGenes: The genes which best explain similarities between all genes in the input list. These genes can indicate which biological processes unify the input genes.
- PCA: A dimensionality reduction technique for exploring the topology of a gene list. The PCA analysis here employs hclust to divide the gene list into functional clusters. If the input list contains more than 100 genes, Rtsne will be used for visualization.
- pathwayEnrich: clusterProfiler's enricher function will be run on the input gene list.
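The difference between variant and co-correlative genes can be illustrated on a toy matrix. This is a sketch under stated assumptions rather than the package's algorithm: "variant" is taken to mean high variance across the input genes, "agreeing" is taken to mean a correlation at or above a hypothetical corThreshold, and the matrix values and gene names are synthetic.

```r
set.seed(7)
inputGenes  <- paste0("INPUT_", 1:6)    # hypothetical input gene list
genomeGenes <- paste0("GENOME_", 1:50)  # hypothetical genome-wide genes

# Background: weak synthetic correlations between every pair
corMat <- matrix(runif(300, -0.2, 0.2), nrow = 50,
                 dimnames = list(genomeGenes, inputGenes))

# Plant one gene correlated with every input gene, and one gene that
# splits the input list into two opposing halves
corMat["GENOME_1", ] <- 0.8
corMat["GENOME_2", ] <- c(0.8, 0.8, 0.8, -0.8, -0.8, -0.8)

# Variant-like gene: highest variance of correlation across the input genes
rowVariance <- apply(corMat, 1, var)
topVariant <- names(which.max(rowVariance))

# Co-correlative-like genes: agree (assumed: correlation >= corThreshold)
# with at least setComparisonCutoff of the input genes
corThreshold <- 0.5          # hypothetical threshold, not a package default
setComparisonCutoff <- 5     # hypothetical value standing in for "Auto"
agreeCount <- rowSums(corMat >= corThreshold)
coCorrelative <- names(agreeCount)[agreeCount >= setComparisonCutoff]

print(topVariant)
print(coCorrelative)
```

The planted gene that splits the list is recovered as the most variant gene, while the planted gene that correlates with every input gene is the only one passing the agreement cutoff.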

Value

A list containing the correlations for the input genes and the results of the chosen analyses and visualizations.

Examples

# Analyze a custom gene list using cancer samples from brain tissue,
# running only the variant-gene and PCA comparisons
genesOfInterest <- c("CDK12", "AURKB", "SFPQ", "NFKB1", "BRCC3", "BRCA2", "PARP1",
                     "DHX9", "SON", "AURKA", "SETX", "BRCA1", "ATMIN")
res <- correlationAnalyzeR::analyzeGenesetTopology(genesOfInterest = genesOfInterest,
                                 Sample_Type = "cancer", returnDataOnly = TRUE,
                                 Tissue = "brain",
                                 crossComparisonType = c("variantGenes", "PCA"))

# Analyze an official MSIGDB term by name, using the default settings
res <- correlationAnalyzeR::analyzeGenesetTopology(genesOfInterest = "HALLMARK_ADIPOGENESIS")


Bishop-Laboratory/correlationAnalyzeR documentation built on June 28, 2022, 8:31 p.m.