get_summed_proportions: Get summed proportions
In NathanSkene/EWCE: Expression Weighted Celltype Enrichment

Description

Given the target gene set, `get_summed_proportions` randomly samples gene lists of equal length, obtains the specificity of the genes in each list, and then computes the mean specificity in each sampled list (and in the target list).
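The sampling idea described above can be sketched in base R. This is a toy illustration, not the package's implementation: the `specificity` matrix, the gene names, and the cell type names are all made up, and none of the optional controls (`geneSizeControl`, `controlledCT`) are modelled.

```r
# Toy sketch in base R; this does not call EWCE.
set.seed(1)

# Hypothetical specificity matrix: rows are genes, columns are cell types,
# and each row sums to 1 (proportion of a gene's expression per cell type).
genes <- paste0("gene", 1:50)
cell_types <- c("astrocyte", "microglia", "neuron")
raw <- matrix(runif(length(genes) * length(cell_types)),
              nrow = length(genes),
              dimnames = list(genes, cell_types))
specificity <- raw / rowSums(raw)

hits <- genes[1:5]   # stand-in for the target gene set
reps <- 100          # number of random gene lists

# Summed specificity per cell type for the target list.
hit_cells <- colSums(specificity[hits, , drop = FALSE])

# One row per random list, each list the same length as `hits`.
bootstrap_data <- t(vapply(
  seq_len(reps),
  function(i) {
    sampled <- sample(genes, length(hits))
    colSums(specificity[sampled, , drop = FALSE])
  },
  numeric(length(cell_types))
))
colnames(bootstrap_data) <- cell_types
```

Dividing `hit_cells` or a row of `bootstrap_data` by `length(hits)` turns the summed values into per-list means.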

Usage

get_summed_proportions(
  hits,
  sct_data,
  annotLevel,
  reps,
  no_cores = 1,
  geneSizeControl,
  controlledCT = NULL,
  control_network = NULL,
  store_gene_data = TRUE,
  verbose = TRUE
)

Arguments

`hits`	List of gene names making up the target gene set.
`sct_data`	List generated using generate_celltype_data.
`annotLevel`	An integer indicating which level of `sct_data` to analyse (Default: 1).
`reps`	Number of random gene lists to generate (Default: 100, but should be >=10,000 for publication-quality results).
`no_cores`	Number of cores to parallelise bootstrapping `reps` over.
`geneSizeControl`	Whether you want to control for GC content and transcript length. Recommended if the gene list originates from genetic studies (Default: FALSE). If set to `TRUE`, then `hits` must be from humans.
`controlledCT`	[Optional] If not NULL, the name of a cell type; the bootstrapping then controls for expression within that cell type.
`control_network`	The control network. Must be provided if `geneSizeControl=TRUE`.
`store_gene_data`	Store sampled gene data for every bootstrap iteration. When the number of bootstrap `reps` is very high (>=100k) and/or the number of genes in `hits` is very high, you may want to set `store_gene_data=FALSE` to avoid using excessive amounts of memory.
`verbose`	Print messages.

Details

See bootstrap_enrichment_test for examples.

Value

A list containing four elements:

hit.cells: vector containing the summed proportion of expression in each cell type for the target list.
gene_data: data.table showing the number of times each gene appeared in the bootstrap sample.
bootstrap_data: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists.
controlledCT: the controlled cell type (if applicable)
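
As a rough sketch of how `hit.cells` and `bootstrap_data` might be compared, the example below computes an empirical p-value per cell type as the fraction of random lists whose summed proportion is at least as large as the target list's. The numbers are invented stand-ins for the returned elements, and this is only one plausible comparison; it is not guaranteed to match what `bootstrap_enrichment_test` does internally.

```r
# Invented stand-ins for two of the returned elements (illustration only).
hit.cells <- c(astrocyte = 1.2, microglia = 3.1, neuron = 0.7)
bootstrap_data <- rbind(
  c(1.5, 1.9, 1.6),
  c(1.8, 1.4, 1.8),
  c(1.1, 2.2, 1.7),
  c(1.6, 1.7, 1.7)
)
colnames(bootstrap_data) <- names(hit.cells)

# For each cell type: fraction of random lists whose summed proportion
# is greater than or equal to that of the target list.
p_values <- vapply(
  names(hit.cells),
  function(ct) mean(bootstrap_data[, ct] >= hit.cells[[ct]]),
  numeric(1)
)
```

With only four random lists the estimate is very coarse; the `reps` argument exists so that many more lists can be drawn.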