subset_cells_by_group: subset_cells_by_group



Utility function to randomly subset very large datasets (those that use too much memory). Specify a maximum number of cells to keep per group and use the subsetted version for analysis.


subset_cells_by_group(dataset_se, = 1000)



Summarised experiment object containing count data. Also requires 'ID' and 'group' to be set within the cell information.

How many cells to keep for each group. Default = 1000


The resulting differential expression table de_table will have reduced statistical power. But as long as enough cells are left to calculate differential expression between groups reasonably accurately, this should be enough for celaref to work with.

Also, this function will lose proportionality of groups (there'll be n.groups or fewer cells of each). Consider using the parameters in contrast_each_group_to_the_rest or contrast_the_group_to_the_rest, which subset non-group cells independently for each group. That may be more appropriate for tissue type samples, which would have similar compositions of cells.

So this function is intended for use when either the proportionality isn't relevant (e.g. FACS-purified cell populations), or the data is just too big to work with otherwise.
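To make the loss of proportionality concrete, here is a minimal base R sketch of capping cells per group. It is not celaref's implementation: the cell_info data frame, the max_per_group cap of 100, and the group sizes are all hypothetical, chosen only for illustration. Only the 'ID' and 'group' column names come from the requirement described for dataset_se.

```r
# Hypothetical cell information table; the 'ID' and 'group' column names
# mirror the cell information that dataset_se is required to carry.
cell_info <- data.frame(
  ID    = paste0("cell", 1:1700),
  group = rep(c("A", "B", "C"), times = c(1500, 150, 50)),
  stringsAsFactors = FALSE
)

max_per_group <- 100  # illustrative cap (an assumption, not the default)

set.seed(42)
# For each group: keep every cell if the group is at or under the cap,
# otherwise draw a random sample of exactly max_per_group cells.
keep_ids <- unlist(
  lapply(split(cell_info$ID, cell_info$group), function(ids) {
    if (length(ids) <= max_per_group) ids else sample(ids, max_per_group)
  }),
  use.names = FALSE
)

subset_info <- cell_info[cell_info$ID %in% keep_ids, ]

# Compare group proportions before and after subsetting.
round(prop.table(table(cell_info$group)), 2)
round(prop.table(table(subset_info$group)), 2)
```

In this toy data, group A falls from about 88% of all cells to 40%, while the small group C (already under the cap) keeps all 50 of its cells and rises from about 3% to 20%.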

Cells are randomly sampled, so set the random seed (with set.seed()) for consistent results across runs.
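The effect of the seed can be checked with base R's sample() alone. This is a sketch on a hypothetical vector of cell IDs, not a call into celaref. The seed value 123 and the sample size are arbitrary.

```r
cell_ids <- paste0("cell", 1:5000)  # hypothetical cell IDs

# Same seed before each draw: the same cells are selected.
set.seed(123)
first_run <- sample(cell_ids, 1000)

set.seed(123)
second_run <- sample(cell_ids, 1000)

same_with_seed <- identical(first_run, second_run)

# No reseeding before this draw: the generator has moved on,
# so a different set of cells is selected.
third_run <- sample(cell_ids, 1000)
same_without_seed <- identical(first_run, third_run)
```

Calling set.seed() immediately before the subsetting step, rather than once at the top of a long script, makes the selection less sensitive to other random calls added earlier in the script.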


dataset_se A hopefully more manageably subsetted version of the input dataset_se.

See Also

contrast_each_group_to_the_rest For an alternative method of subsetting cells proportionally.


dataset_se.30pergroup <- subset_cells_by_group(demo_query_se,

MonashBioinformaticsPlatform/celaref documentation built on June 5, 2019, 11:35 a.m.