do.subsample: Subsample data


View source: R/do.subsample.R

Description

Subsamples a dataset in one of three ways: (1) randomly select a desired number of cells from the whole dataset (DEFAULT); (2) select a specified number of cells from each sample (specify divide.by and supply one target per sample); or (3) select the same number of cells from each sample, set by the sample with the lowest cell count (specify divide.by and set min.per = TRUE). Useful for reducing the total number of cells before generating dimensionality reduction plots (tSNE/UMAP).

Usage

do.subsample(dat, targets, divide.by = NULL, min.per = FALSE, seed = 42)

Arguments

dat

NO DEFAULT. Input dataframe with cells (rows) vs markers (columns).

targets

NO DEFAULT. Number of cells to subsample. If divide.by is specified, this must be a vector of subsample targets, one per group, in the same order as the unique entries in the divide.by column.

divide.by

DEFAULT = NULL. Character. Name of the column that defines groupings of cells (sample names, group names, etc.), used when each group should be subsampled separately.

min.per

DEFAULT = FALSE. Logical. If TRUE and divide.by is specified, each sample contributes the same number of cells, set by the sample with the lowest cell count.

seed

DEFAULT = 42. Numeric. Seed for reproducibility.
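
How targets, divide.by and min.per interact can be sketched in base R. This is an illustration of the three documented modes only, not Spectre's implementation; the helper subsample.sketch and the toy data frame are hypothetical names introduced for this example.

```r
# Illustrative sketch only: NOT Spectre's implementation.
# `subsample.sketch` is a hypothetical helper written for this example.
subsample.sketch <- function(dat, targets = NULL, divide.by = NULL,
                             min.per = FALSE, seed = 42) {
  set.seed(seed)

  # Mode 1: no grouping column, draw `targets` rows from the whole dataset
  if (is.null(divide.by)) {
    return(dat[sample(seq_len(nrow(dat)), targets), , drop = FALSE])
  }

  groups <- unique(dat[[divide.by]])

  # Mode 3: every group contributes as many rows as the smallest group has
  if (min.per) {
    targets <- rep(min(table(dat[[divide.by]])), length(groups))
  }

  # Modes 2 and 3: draw targets[i] rows from the i-th unique group
  stopifnot(length(targets) == length(groups))
  rows <- unlist(lapply(seq_along(groups), function(i) {
    idx <- which(dat[[divide.by]] == groups[i])
    idx[sample.int(length(idx), targets[i])]
  }))
  dat[rows, , drop = FALSE]
}

# Toy data: 500 cells from sample "A", 300 from sample "B"
toy <- data.frame(FileName = rep(c("A", "B"), times = c(500, 300)),
                  Marker1 = rnorm(800))

random.sub <- subsample.sketch(toy, targets = 100)
per.sample <- subsample.sketch(toy, targets = c(100, 50), divide.by = "FileName")
equal.sub  <- subsample.sketch(toy, divide.by = "FileName", min.per = TRUE)
```

In this sketch the order of targets follows unique(dat[[divide.by]]), which matches the ordering requirement stated for the targets argument.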

Author(s)

Thomas Ashhurst, thomas.ashhurst@sydney.edu.au
Felix Marsh-Wakefield, felix.marsh-wakefield@sydney.edu.au

Examples

# Subsample 10,000 cells randomly from the total dataset
sub.dat <- Spectre::do.subsample(dat = Spectre::demo.start,
                                 targets = 10000)

# Subsample based on the sample with the smallest number of cells
sub.dat.sample <- Spectre::do.subsample(dat = Spectre::demo.start,
                                        divide.by = "FileName",
                                        min.per = TRUE)
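
The examples above do not cover the remaining mode, where divide.by is given together with one target per sample. A possible call is sketched below; it assumes Spectre and its demo.start dataset are installed and that every sample contains at least 500 cells. The value 500 and the names n.samples and sub.dat.targets are chosen for illustration only.

```r
# Assumes the Spectre package is installed and that every sample in
# demo.start has at least 500 cells (an assumption, not checked here).
n.samples <- length(unique(Spectre::demo.start[["FileName"]]))

# One target per unique FileName entry, in the same order
sub.dat.targets <- Spectre::do.subsample(dat = Spectre::demo.start,
                                         targets = rep(500, n.samples),
                                         divide.by = "FileName")
```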

sydneycytometry/Spectre documentation built on March 20, 2021, 2:15 a.m.