Utility function to randomly subset very large datasets that would otherwise use too much memory. Specify a maximum number of cells to keep per group, then use the subsetted version for analysis.
subset_cells_by_group(dataset_se, n.group = 1000)
dataset_se A SummarizedExperiment object containing count data. 'ID' and 'group' must also be set within the cell information.
n.group Maximum number of cells to keep for each group. Default = 1000
The resulting differential expression table de_table will have reduced statistical power. As long as enough cells remain to calculate differential expression between groups with reasonable accuracy, this should still be enough for celaref to work with.
This function also loses the proportionality of groups (there will be n.group or fewer cells in each). Consider using the n.group/n.other parameters in contrast_each_group_to_the_rest or contrast_the_group_to_the_rest instead, which subset the non-group cells independently for each group. That may be more appropriate for tissue-type samples, which would have similar compositions of cells.
This function is therefore intended for use when either the proportionality isn't relevant (e.g. FACS-purified cell populations) or the data is simply too big to work with otherwise.
Cells are randomly sampled, so set the random seed (with set.seed()) for consistent results across runs.
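The per-group cap and the effect of the seed can be illustrated with a minimal base R sketch. This is not celaref's implementation: cell_info is a made-up stand-in for the cell information table, and cap_cells_per_group is a hypothetical helper written only for this illustration.

```r
# Hypothetical stand-in for the cell information (one row per cell).
cell_info <- data.frame(
  ID    = paste0("cell", 1:1700),
  group = rep(c("A", "B", "C"), times = c(1500, 150, 50)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep at most n.group randomly chosen cells per group.
cap_cells_per_group <- function(cell_info, n.group = 1000) {
  ids_by_group <- split(cell_info$ID, cell_info$group)
  kept <- lapply(ids_by_group, function(ids) {
    # Groups already at or under the cap are kept whole.
    if (length(ids) <= n.group) ids else sample(ids, n.group)
  })
  cell_info[cell_info$ID %in% unlist(kept), , drop = FALSE]
}

set.seed(42)  # fix the seed so the same cells are picked on every run
subset_info <- cap_cells_per_group(cell_info, n.group = 100)

table(cell_info$group)    # before: A = 1500, B = 150, C = 50
table(subset_info$group)  # after:  A = 100,  B = 100, C = 50
```

Group A was ten times the size of group B before subsetting and is the same size afterwards, which is the loss of proportionality described above. Calling set.seed() with the same value before each run selects the same cells.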
dataset_se A more manageably sized, subsetted version of the input dataset_se.
contrast_each_group_to_the_rest For an alternative method of subsetting cells proportionally.