generate_bootstrap_plots: Generate bootstrap plots

View source: R/generate_bootstrap_plots.r

Description

generate_bootstrap_plots takes a gene list and a single cell type transcriptome dataset and generates plots which show how the expression of the genes in the list compares to that of genes in randomly generated gene lists.

Usage

generate_bootstrap_plots(
  sct_data = NULL,
  hits = NULL,
  bg = NULL,
  genelistSpecies = NULL,
  sctSpecies = NULL,
  output_species = "human",
  method = "homologene",
  reps = 100,
  annotLevel = 1,
  geneSizeControl = FALSE,
  full_results = NULL,
  listFileName = paste0("_level", annotLevel),
  adj_pval_thresh = 0.05,
  facets = "CellType",
  scales = "free_x",
  save_dir = file.path(tempdir(), "BootstrapPlots"),
  show_plot = TRUE,
  verbose = TRUE
)

Arguments

sct_data

List generated using generate_celltype_data.

hits

List of gene symbols containing the target gene list. Will automatically be converted to human gene symbols if geneSizeControl=TRUE.

bg

List of gene symbols containing the background gene list (including hit genes). If bg=NULL, an appropriate gene background will be created automatically.

genelistSpecies

Species that hits genes came from (no longer limited to just "mouse" and "human"). See list_species for all available species.

sctSpecies

Species that sct_data is currently formatted as (no longer limited to just "mouse" and "human"). See list_species for all available species.

output_species

Species to convert sct_data and hits to (Default: "human"). See list_species for all available species.

method

R package to use for gene mapping:

  • "gprofiler" : Slower but more species and genes.

  • "homologene" : Faster but fewer species and genes.

  • "babelgene" : Faster but fewer species and genes. Also gives consensus scores for each gene mapping based on several different data sources.

reps

Number of random gene lists to generate (Default: 100, but should be >=10,000 for publication-quality results).

annotLevel

An integer indicating which level of sct_data to analyse (Default: 1).

geneSizeControl

Whether you want to control for GC content and transcript length. Recommended if the gene list originates from genetic studies (Default: FALSE). If set to TRUE, then hits must be from humans.

full_results

The full output of bootstrap_enrichment_test for the same gene list.

listFileName

String used as the root for files saved using this function.

adj_pval_thresh

Adjusted p-value threshold for cell types to include in plots (Default: 0.05).

facets

[Deprecated] Please use rows and cols instead.

scales

Are scales shared across all facets ("fixed"), or do they vary across rows ("free_x"), columns ("free_y"), or both rows and columns ("free")? (Default: "free_x").

save_dir

Directory where the BootstrapPlots folder should be saved (Default: a temporary directory).

show_plot

Print the plot.

verbose

Print messages.
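
The recommendation for reps above can be motivated with a small base-R sketch. It assumes, for illustration only, that an empirical p-value is computed as the fraction of bootstrap replicates whose statistic is at least as large as the observed one. The helper empirical_p and the simulated values are hypothetical and not part of EWCE. Under that assumption, the smallest non-zero p-value that can be reported is 1/reps, so a small reps gives only a coarse estimate.

```r
# Hypothetical helper (not part of EWCE): fraction of bootstrap statistics
# that are at least as large as the observed statistic.
empirical_p <- function(observed, bootstrap_stats) {
    mean(bootstrap_stats >= observed)
}

set.seed(1)
observed <- 3  # illustrative observed statistic

for (reps in c(100, 10000)) {
    bootstrap_stats <- rnorm(reps)  # simulated null statistics
    p <- empirical_p(observed, bootstrap_stats)
    # Smallest non-zero p-value attainable with this many replicates
    resolution <- 1 / reps
    cat("reps =", reps, " p =", p, " resolution =", resolution, "\n")
}
```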

Value

Saves a set of PDF files containing graphs and returns the file where they are saved. The file names are adjusted using the value of listFileName, and the files are saved into the 'BootstrapPlots' folder. File names start with one of the following:

  • qqplot_noText: sorts the gene list according to how enriched it is in the relevant cell type. Plots the value in the target list against the mean value in the bootstrapped lists.

  • qqplot_wtGSym: as above but labels the gene symbols for the highest expressed genes.

  • bootDists: rather than just showing the mean of the bootstrapped lists, a boxplot shows the distribution of values.

  • bootDists_LOG: shows the bootstrapped distributions with the y-axis on a log scale.
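
The idea behind the qqplot and bootDists files can be illustrated with a base-R sketch on simulated data. The values, list sizes and variable names below are made up for illustration, and this is not EWCE's plotting code. It only shows the principle of sorting the target list's values and comparing them, rank by rank, with the sorted bootstrapped lists.

```r
set.seed(42)
n_genes <- 20   # size of the (simulated) target gene list
reps <- 100     # number of (simulated) bootstrapped lists

# Simulated per-gene values for the target list in one cell type
hit_values <- sort(rexp(n_genes, rate = 0.5))

# Each column is one bootstrapped list, sorted so that ranks line up
boot_sorted <- replicate(reps, sort(rexp(n_genes, rate = 1)))

# Mean value at each rank across the bootstrapped lists
boot_mean <- rowMeans(boot_sorted)

# Target values against bootstrap means; points above the dashed diagonal
# have higher values in the target list than the bootstrap average
plot(boot_mean, hit_values,
     xlab = "Mean value in bootstrapped lists",
     ylab = "Value in target list")
abline(0, 1, lty = 2)

# Per-rank bootstrap distributions (one box per rank)
boxplot(t(boot_sorted),
        xlab = "Rank within list", ylab = "Bootstrapped value")
```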

Examples

## Load the single cell data
sct_data <- ewceData::ctd()

## Set the parameters for the analysis
## Use 5 bootstrap lists for speed; for a publishable analysis use >=10,000
reps <- 5

## Load the example gene list (treated as human, see genelistSpecies below)
hits <- ewceData::example_genelist()

## Bootstrap significance test,
##  no control for transcript length or GC content
## Use pre-computed results to speed up example
full_results <- EWCE::example_bootstrap_results()

### Skip this for example purposes
# full_results <- EWCE::bootstrap_enrichment_test(
#    sct_data = sct_data,
#    hits = hits,
#    reps = reps,
#    annotLevel = 1,
#    sctSpecies = "mouse",
#    genelistSpecies = "human"
# )

output <- EWCE::generate_bootstrap_plots(
    sct_data = sct_data,
    hits = hits,
    reps = reps,
    full_results = full_results,
    sctSpecies = "mouse",
    genelistSpecies = "human",
    annotLevel = 1
)
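
After a call like the one above, the saved files can be located with base R. This sketch assumes the default save_dir shown in Usage and makes no assumption about the structure of the returned output object. It yields an empty character vector if nothing has been saved yet.

```r
# Default save location taken from the Usage section
save_dir <- file.path(tempdir(), "BootstrapPlots")

# List any PDF files written under that directory (searching subfolders too)
pdf_files <- list.files(save_dir, pattern = "\\.pdf$",
                        recursive = TRUE, full.names = TRUE)
print(pdf_files)
```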

NathanSkene/EWCE documentation built on April 10, 2024, 1:02 a.m.