tof_downsample_constant: Downsample high-dimensional cytometry data by randomly...

View source: R/downsampling.R

tof_downsample_constant    R Documentation

Downsample high-dimensional cytometry data by randomly selecting a constant number of cells per group.

Description

This function downsamples the number of cells in a 'tof_tbl' by randomly selecting 'num_cells' cells from each unique combination of values in 'group_cols'.

Usage

tof_downsample_constant(tof_tibble, group_cols = NULL, num_cells)

Arguments

tof_tibble

A 'tof_tbl' or a 'tibble'.

group_cols

Unquoted names of the columns in 'tof_tibble' that should be used to define groups from which 'num_cells' will be downsampled. Supports tidyselect helpers. Defaults to 'NULL' (no grouping).

num_cells

An integer number of cells that should be sampled from each group defined by 'group_cols'.

Value

A 'tof_tbl' with the same number of columns as the input 'tof_tibble', and at most as many rows. If every group contains at least 'num_cells' cells, the number of rows will be 'num_cells' multiplied by the number of unique combinations of the values in 'group_cols'. If a group has fewer than 'num_cells' cells, all cells from that group are kept, so that group contributes fewer than 'num_cells' rows.
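The row-count rule above can be illustrated with a minimal base R sketch. This is not tidytof's implementation. The helper name 'downsample_constant_sketch' and the toy data are hypothetical, and the sketch only mimics the documented behavior: sample 'num_cells' rows per group, and keep every row of a group that is smaller than 'num_cells'.

```r
# Hypothetical helper (not part of tidytof): constant downsampling in base R.
downsample_constant_sketch <- function(data, group, num_cells) {
    # Row indices of `data`, split by group label
    idx_by_group <- split(seq_len(nrow(data)), group)

    keep <- lapply(idx_by_group, function(idx) {
        if (length(idx) <= num_cells) {
            # Small group: keep every cell
            idx
        } else {
            # Large group: sample num_cells cells without replacement
            idx[sample.int(length(idx), size = num_cells)]
        }
    })

    # Return the kept rows in their original order
    data[sort(unlist(keep, use.names = FALSE)), , drop = FALSE]
}

set.seed(2024)
toy <- data.frame(
    cd45 = rnorm(55),
    cluster_id = rep(c("a", "b"), times = c(50, 5))
)

result <- downsample_constant_sketch(toy, toy$cluster_id, num_cells = 10L)

# Cluster "a" (50 cells) is reduced to 10 rows; cluster "b" (5 cells) is kept whole
table(result$cluster_id)
```

Here the result has 15 rows rather than 20, because one of the two groups holds fewer than 'num_cells' cells.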

See Also

Other downsampling functions: tof_downsample(), tof_downsample_density(), tof_downsample_prop()

Examples

library(tidytof)

# simulate 1,000 cells with four markers and a random cluster label
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# sample 500 cells from the input data
tof_downsample_constant(
    tof_tibble = sim_data,
    num_cells = 500L
)

# sample 20 cells per cluster from the input data
tof_downsample_constant(
    tof_tibble = sim_data,
    group_cols = cluster_id,
    num_cells = 20L
)


keyes-timothy/tidytof documentation built on Aug. 28, 2024, 8:37 a.m.