tof_downsample_density | R Documentation |
This function downsamples the number of cells in a 'tof_tbl' using the density-dependent downsampling algorithm described in Qiu et al., (2011).
tof_downsample_density(
tof_tibble,
group_cols = NULL,
density_cols = where(tof_is_numeric),
target_num_cells,
target_prop_cells,
target_percentile = 0.03,
outlier_percentile = 0.01,
distance_function = c("euclidean", "cosine", "l2", "ip"),
density_estimation_method = c("mean_distance", "sum_distance", "spade"),
...
)
tof_tibble |
A 'tof_tbl' or a 'tibble'. |
group_cols |
Unquoted names of the columns in 'tof_tibble' that should be used to define groups within which the downsampling will be performed. Supports tidyselect helpers. Defaults to 'NULL' (no grouping). |
density_cols |
Unquoted names of the columns in 'tof_tibble' to use in the density estimation for each cell. Defaults to all numeric columns in 'tof_tibble'. |
target_num_cells |
An approximate constant number of cells that should be sampled from each group defined by 'group_cols'. Slightly more or fewer cells may be returned due to how the density calculation is performed. |
target_prop_cells |
An approximate proportion of cells (between 0 and 1) that should be sampled from each group defined by 'group_cols'. Slightly more or fewer cells may be returned due to how the density calculation is performed. Ignored if 'target_num_cells' is specified. |
target_percentile |
The local density percentile (i.e. a value between 0 and 1) to which the downsampling procedure should adjust all cells. In short, the algorithm will continue to remove cells from the input 'tof_tibble' until the local density of each remaining cell is equal to 'target_percentile'. Lower values will result in more cells being removed. See Qiu et al., (2011) for details. Defaults to 0.03 (the 3rd percentile of local densities), matching the usage signature. Ignored if either 'target_num_cells' or 'target_prop_cells' is specified. |
outlier_percentile |
The local density percentile (i.e. a value between 0 and 1) below which cells should be considered outliers (and discarded). Cells with a local density below 'outlier_percentile' will never be selected during the downsampling procedure. Defaults to 0.01 (cells below the 1st local density percentile will be removed). |
distance_function |
A string indicating which distance function to use for the cell-to-cell distance calculations. Options are "euclidean" (the default), "cosine", "l2", and "ip", as listed in the usage signature. |
density_estimation_method |
A string indicating which algorithm should be used to calculate the local density estimate for each cell. Options include k-nearest-neighbor density estimation using the mean distance to a cell's k nearest neighbors ("mean_distance"; the default), k-nearest-neighbor density estimation using the summed distance to a cell's k nearest neighbors ("sum_distance"), and counting the number of neighboring cells within a spherical radius around each cell as described in Qiu et al., (2011) ("spade"). While "spade" often produces the best results, it is slower than the k-nearest-neighbor density estimation methods. |
... |
Optional additional arguments to pass to
|
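The k-nearest-neighbor options described for 'density_estimation_method' can be illustrated with a small base R sketch. This is not the package's implementation: the helper name 'knn_density', the choice of k, and the use of the reciprocal of the summarized distance as the density value are all assumptions made for illustration.

```r
# Hypothetical helper (not part of tidytof): k-nearest-neighbor density estimate.
# Assumption: density is taken as 1 / (mean or summed distance to the k nearest cells).
knn_density <- function(x, k = 10, method = c("mean_distance", "sum_distance")) {
  method <- match.arg(method)
  x <- as.matrix(x)
  stopifnot(k >= 1, k < nrow(x))
  # Full pairwise Euclidean distance matrix; only practical for small inputs
  d <- as.matrix(stats::dist(x, method = "euclidean"))
  summarize_fn <- if (method == "mean_distance") mean else sum
  # For each cell: sort its distances, skip the zero self-distance, keep the k nearest
  summary_dist <- apply(d, 1, function(row) summarize_fn(sort(row)[2:(k + 1)]))
  # Short distances to neighbors correspond to a high local density
  1 / summary_dist
}

set.seed(2011)
# 200 cells in a tight cluster followed by 50 cells in a diffuse cluster
cells <- cbind(
  cd45 = c(rnorm(200, sd = 0.2), rnorm(50, mean = 5, sd = 2)),
  cd34 = c(rnorm(200, sd = 0.2), rnorm(50, mean = 5, sd = 2))
)
dens <- knn_density(cells, k = 10, method = "mean_distance")
# Compare the typical density in the tight cluster with the diffuse cluster
median(dens[1:200]) > median(dens[201:250])
```

Cells in the tight cluster receive larger density values than cells in the diffuse cluster, which is what makes them more likely to be removed during density-dependent downsampling.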
A 'tof_tbl' with the same number of columns as the input 'tof_tibble', but fewer rows. The number of rows will depend on the chosen value of 'target_percentile', with fewer cells selected with lower values of 'target_percentile'.
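To make the roles of 'target_percentile' and 'outlier_percentile' concrete, here is a minimal base R sketch of the density-dependent sampling rule as it is commonly summarized from Qiu et al., (2011): cells below the outlier density are dropped, cells up to the target density are always kept, and denser cells are kept with probability target density divided by their own density. The helper 'downsample_by_density' is hypothetical and is not the package's code; the simulated densities are placeholders.

```r
# Hypothetical helper (not part of tidytof) illustrating the sampling rule.
# Takes a vector of positive local densities and returns indices of kept cells.
downsample_by_density <- function(densities,
                                  target_percentile = 0.03,
                                  outlier_percentile = 0.01) {
  outlier_density <- stats::quantile(densities, probs = outlier_percentile, names = FALSE)
  target_density <- stats::quantile(densities, probs = target_percentile, names = FALSE)
  # Keep probability: 1 up to the target density, target / density above it
  keep_prob <- pmin(1, target_density / densities)
  # Cells below the outlier density are never selected
  keep_prob[densities < outlier_density] <- 0
  which(stats::runif(length(densities)) < keep_prob)
}

set.seed(2011)
# Placeholder local densities for 1000 simulated cells
densities <- rlnorm(1000)

kept_low <- downsample_by_density(densities, target_percentile = 0.03)
kept_high <- downsample_by_density(densities, target_percentile = 0.5)

# A lower target percentile removes more cells
length(kept_low) < length(kept_high)
```

Because selection is random, the number of cells kept varies from run to run, which is consistent with the note that slightly more or fewer cells than requested may be returned.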
Other downsampling functions: tof_downsample(), tof_downsample_constant(), tof_downsample_prop()
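# tof_downsample_density() is provided by the tidytof package
library(tidytof)

# simulate 1000 cells measured on four markers
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 1000),
    cd38 = rnorm(n = 1000),
    cd34 = rnorm(n = 1000),
    cd19 = rnorm(n = 1000)
  )

# keep approximately half of the cells, using the "spade" density estimate
tof_downsample_density(
  tof_tibble = sim_data,
  density_cols = c(cd45, cd34, cd38),
  target_prop_cells = 0.5,
  density_estimation_method = "spade"
)

# keep approximately 200 cells, using the "spade" density estimate
tof_downsample_density(
  tof_tibble = sim_data,
  density_cols = c(cd45, cd34, cd38),
  target_num_cells = 200L,
  density_estimation_method = "spade"
)

# keep approximately 200 cells, using the mean distance to nearest neighbors
tof_downsample_density(
  tof_tibble = sim_data,
  density_cols = c(cd45, cd34, cd38),
  target_num_cells = 200L,
  density_estimation_method = "mean_distance"
)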