contrast_each_group_to_the_rest: contrast_each_group_to_the_rest
In celaref: Single-cell RNAseq cell cluster labelling by reference


Produces a table of within-experiment differential expression results (for either query or reference experiment), where each group (cluster) is compared to the rest of the cells.

contrast_each_group_to_the_rest(dataset_se, dataset_name,
  groups2test = NA, num_cores = 1, n.group = Inf,
  n.other = n.group * 5)

`dataset_se`	SummarizedExperiment object containing count data. 'ID' and 'group' must also be set within the cell information (see `colData()`).
`dataset_name`	Short, meaningful name for this dataset/experiment.
`groups2test`	An optional character vector specifying which groups to check. By default (set to NA), all groups will be tested.
`num_cores`	Number of cores to use to run MAST jobs in parallel. Ignored if parallel package not available. Set to 1 to avoid parallelisation. Default = 1
`n.group`	How many cells to keep for each group in groupwise comparisons. Default = Inf
`n.other`	How many cells to keep from everything not in the group. Default = n.group * 5

Note that this function is slow, because it runs the differential expression analysis. It only needs to be run once per dataset, though (unless the group labels change). Having the parallel package installed is highly recommended.

If this function runs out of memory, consider specifying n.group and n.other to run on a subset of cells (taken from each group, and proportionally from the rest for each test). Alternatively, use subset_cells_by_group to subset dataset_se for each group independently.
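The effect of n.group and n.other can be pictured with a small base R sketch. It is only an illustration of the idea, not celaref's implementation: `subsample_for_test` is a hypothetical helper, the cell table is made up, and plain random sampling is assumed for the cells outside the tested group.

```r
# Toy cell table standing in for colData(dataset_se); IDs and groups are made up.
set.seed(1)
cells <- data.frame(
  ID    = paste0("cell", 1:60),
  group = rep(c("A", "B", "C"), times = c(30, 20, 10)),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of celaref): keep at most n.group cells from
# the tested group and at most n.other cells from everything else.
subsample_for_test <- function(cells, test_group,
                               n.group = Inf, n.other = n.group * 5) {
  in_group <- cells$ID[cells$group == test_group]
  others   <- cells$ID[cells$group != test_group]
  # Only sample when there are more cells than requested.
  keep_in  <- if (length(in_group) > n.group) sample(in_group, n.group) else in_group
  keep_out <- if (length(others) > n.other) sample(others, n.other) else others
  cells[cells$ID %in% c(keep_in, keep_out), ]
}

# Test group A with 5 in-group cells; n.other defaults to 5 * 5 = 25.
sub <- subsample_for_test(cells, "A", n.group = 5)
table(in_group = sub$group == "A")
```

With the defaults (n.group = Inf) the helper keeps every cell, which mirrors the documented default of testing on the full dataset.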

Both reference and query datasets should be processed with this function.

The tables produced by this function (usually named something like de_table.datasetname) contain a summary of the MAST results. Each test compares the cells in a group against the cells not in that group (i.e. it is always a 2-group contrast; distinctions among the other groups are ignored). As per MAST recommendations, the proportion of genes seen in each cell is included in the model.
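As a rough sketch of what a 2-group contrast looks like for one group, the base R snippet below builds the in-group/rest label and the per-cell proportion of detected genes from a toy count matrix. The counts, the group labels, the helper `contrast_for` and the column names are all made up for illustration; the snippet does not fit a MAST model and is not taken from the package source.

```r
# Toy count matrix: 5 genes x 6 cells (values are random, for illustration only).
set.seed(42)
counts <- matrix(rpois(5 * 6, lambda = 1), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:6)))
group <- c("A", "A", "B", "C", "B", "A")

# Proportion of genes with a non-zero count in each cell.
prop_genes_detected <- colMeans(counts > 0)

# Hypothetical helper: collapse all group labels into "in_group" vs "rest".
contrast_for <- function(g) {
  factor(ifelse(group == g, "in_group", "rest"), levels = c("rest", "in_group"))
}

# Per-cell covariates for the test of group B against everything else.
design_B <- data.frame(
  cell     = colnames(counts),
  contrast = contrast_for("B"),
  prop_genes_detected
)
design_B
```

Groups A and C both end up in the "rest" level here, which is the sense in which information about the other groups is ignored.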

A tibble: the within-experiment de_table (differential expression table). This is a core summary of the individual experiment/dataset, which is used for the cross-dataset comparisons.

The table fields won't necessarily match across datasets, as they include cell annotation information. Important columns (used in downstream analysis) are:

ID: Gene identifier
ci_inner: Inner (conservative) bound of the 95% confidence interval of the log2 fold-change.
fdr: Multiple hypothesis corrected p-value (using BH/FDR method)
group: Cells from this group were compared to everything else
sig_up: Significantly differentially expressed (fdr < 0.01), with a positive fold change?
rank: Rank position (within group), ranked by CI inner, highest to lowest.
rescaled_rank: Rank scaled from 0 (top-most overrepresented genes in the group) to 1 (top-most not-present genes)
dataset: Name of dataset/experiment
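To make the relationship between these columns concrete, here is a small base R sketch on a made-up table. The values, the `log2FC` column name and the exact rescaling formula (rank divided by the number of genes in the group) are assumptions for illustration only, and may differ from what the function actually computes.

```r
# Made-up results for four genes in one group.
de_toy <- data.frame(
  ID       = paste0("gene", 1:4),
  group    = "A",
  log2FC   = c(3.0, 0.2, -2.4, 1.5),   # hypothetical fold-change column
  ci_inner = c(2.1, 0.0, -1.5, 0.7),
  fdr      = c(0.001, 0.5, 0.002, 0.03),
  stringsAsFactors = FALSE
)

# sig_up: significant (fdr < 0.01) and a positive fold change.
de_toy$sig_up <- de_toy$fdr < 0.01 & de_toy$log2FC > 0

# rank: position within each group, ordered by ci_inner from highest to lowest.
de_toy$rank <- ave(-de_toy$ci_inner, de_toy$group,
                   FUN = function(x) rank(x, ties.method = "first"))

# rescaled_rank: rank divided by the number of genes in the group (assumed formula).
de_toy$rescaled_rank <- ave(de_toy$rank, de_toy$group,
                            FUN = function(r) r / length(r))

de_toy[order(de_toy$rank), ]
```

In this toy table gene3 has a small fdr but a negative fold change, so it is not flagged as sig_up and lands at the bottom of the ranking.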

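# Query dataset, run on a single core (the default)
de_table.demo_query  <- contrast_each_group_to_the_rest(
     demo_query_se, "a_demo_query")

# Reference dataset, with MAST jobs spread over 2 cores
de_table.demo_ref    <- contrast_each_group_to_the_rest(
     demo_ref_se, "a_demo_ref", num_cores=2)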