tof_downsample_prop: Downsample high-dimensional cytometry data by randomly...

View source: R/downsampling.R

tof_downsample_prop    R Documentation

Downsample high-dimensional cytometry data by randomly selecting a proportion of the cells in each group.

Description

This function downsamples a 'tof_tbl' by randomly selecting a proportion ('prop_cells') of the cells within each group, where a group is a unique combination of values in 'group_cols'. If 'group_cols' is not supplied, the proportion is drawn from all cells together.
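The group-then-sample behavior described above can be illustrated with a minimal base R sketch. 'downsample_prop_sketch()' is a hypothetical helper written for this illustration; it is not part of tidytof and is not the package's implementation. Unlike 'tof_downsample_prop()', it takes the grouping column as a quoted string, and it assumes that the number of cells kept per group is rounded down.

```r
# Hypothetical helper (not part of tidytof): keep a fixed proportion of rows
# from each group of a data frame.
downsample_prop_sketch <- function(data, group_col = NULL, prop_cells) {
    stopifnot(prop_cells >= 0, prop_cells <= 1)
    row_ids <- seq_len(nrow(data))

    # With no grouping column, treat all rows as a single group.
    groups <- if (is.null(group_col)) {
        list(row_ids)
    } else {
        split(row_ids, data[[group_col]])
    }

    keep <- lapply(groups, function(ids) {
        # Assumed rounding rule: round the per-group count down.
        n_keep <- floor(length(ids) * prop_cells)
        # Sample positions, not values, to avoid sample()'s special case
        # for a single numeric input.
        ids[sample.int(length(ids), size = n_keep)]
    })

    # Return the kept rows in their original order.
    data[sort(unlist(keep, use.names = FALSE)), , drop = FALSE]
}

set.seed(2024)
sim_cells <- data.frame(
    cd45 = rnorm(1000),
    cluster_id = sample(letters, size = 1000, replace = TRUE)
)

sampled <- downsample_prop_sketch(
    sim_cells,
    group_col = "cluster_id",
    prop_cells = 0.1
)

nrow(sampled)
table(sampled$cluster_id)
```

Sampling within each group, rather than from the pooled data, keeps every cluster represented in roughly its original share of the cells.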

Usage

tof_downsample_prop(tof_tibble, group_cols = NULL, prop_cells)

Arguments

tof_tibble

A 'tof_tbl' or a 'tibble'.

group_cols

Unquoted names of the columns in 'tof_tibble' that should be used to define groups from which 'prop_cells' will be downsampled. Supports tidyselect helpers. Defaults to 'NULL' (no grouping).

prop_cells

A proportion of cells (between 0 and 1) that should be sampled from each group defined by 'group_cols'.

Value

A 'tof_tbl' with the same number of columns as the input 'tof_tibble', but fewer rows. Specifically, the number of rows will be approximately 'prop_cells' times the number of rows in the input 'tof_tibble'; the total can differ slightly because each group contributes a whole number of cells.
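The row-count arithmetic can be worked through in base R. This is only a sketch: the group sizes are simulated, and rounding each per-group count down is an assumption made for illustration, not a documented property of 'tof_downsample_prop()'.

```r
set.seed(2024)
prop_cells <- 0.1

# Simulated cluster labels for 1000 cells
cluster_id <- sample(letters, size = 1000, replace = TRUE)
group_sizes <- as.numeric(table(cluster_id))

# Exact (possibly fractional) number of cells requested from each group
exact_per_group <- group_sizes * prop_cells

# Whole number of cells per group under the assumed round-down rule
whole_per_group <- floor(exact_per_group)

# Compare the nominal total with the total after per-group rounding
nominal_total <- sum(exact_per_group)
rounded_total <- sum(whole_per_group)
c(nominal = nominal_total, rounded = rounded_total)
```

The rounded total can never exceed the nominal one under this rule, and the gap is bounded by the number of groups.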

See Also

Other downsampling functions: tof_downsample(), tof_downsample_constant(), tof_downsample_density()

Examples

library(tidytof)

# simulate 1000 cells with 4 markers and a cluster label
set.seed(2024)
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

# sample 10% of all cells from the input data
tof_downsample_prop(
    tof_tibble = sim_data,
    prop_cells = 0.1
)

# sample 10% of all cells from each cluster in the input data
tof_downsample_prop(
    tof_tibble = sim_data,
    group_cols = cluster_id,
    prop_cells = 0.1
)
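To see the effect of grouped downsampling, the per-cluster cell counts can be compared before and after the call. The sketch below assumes that the tidytof and dplyr packages are installed and skips the comparison otherwise; the column names 'n_before' and 'n_after' are chosen here for illustration.

```r
# Run only if both packages are available (an assumption of this sketch).
if (requireNamespace("tidytof", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
    set.seed(2024)  # make the simulated data reproducible

    sim_data <- dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

    downsampled <- tidytof::tof_downsample_prop(
        tof_tibble = sim_data,
        group_cols = cluster_id,
        prop_cells = 0.1
    )

    # Count cells per cluster before and after downsampling
    before <- dplyr::count(sim_data, cluster_id, name = "n_before")
    after <- dplyr::count(downsampled, cluster_id, name = "n_after")

    # One row per cluster, with both counts side by side
    print(dplyr::left_join(before, after, by = "cluster_id"), n = 26)
}
```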


keyes-timothy/tidytof documentation built on Aug. 28, 2024, 8:37 a.m.