View source: R/quality_control.R
tof_assess_flow_rate | R Documentation |
This function performs a simplified version of flowAI's statistical test to detect time periods with abnormal flow rates over the course of a flow cytometry experiment. Briefly, the relative flow rates for each timestep throughout data acquisition are calculated (see tof_calculate_flow_rate), and outlier timepoints with particularly high or low flow rates (i.e. those beyond extreme values of the t-distribution across timesteps) are flagged.
tof_assess_flow_rate(
tof_tibble,
time_col,
group_cols,
num_timesteps = nrow(tof_tibble)/1000,
alpha_threshold = 0.01,
visualize = FALSE,
...,
augment = FALSE
)
tof_tibble |
A 'tof_tbl' or 'tibble'. |
time_col |
An unquoted column name indicating which column in 'tof_tibble' contains the time at which each cell was collected. |
group_cols |
Optional. Unquoted column names indicating which columns should be used to group cells before analysis. Flow rate calculation is then performed independently within each group. Supports tidyselect helpers. |
num_timesteps |
The number of bins into which 'time_col' should be split. to define "timesteps" of the data collection process. The number of cells analyzed by the cytometer will be counted in each bin separately and will represent the relative average flow rate for that timestep in data collection. |
alpha_threshold |
A scalar between 0 and 1 indicating the two-tailed significance level at which to draw outlier thresholds in the t-distribution with 'num_timesteps' - 1 degrees of freedom. Defaults to 0.01. |
visualize |
A boolean value indicating if a plot should be generated to visualize each timestep's relative flow rate (by group) instead of returning the tibble directly. Defaults to FALSE. |
... |
Optional additional arguments to pass to |
augment |
A boolean value indicating if the output should column-bind the computed flags for each cell (see below) as new columns in 'tof_tibble' (TRUE) or if a tibble including only the computed flags should be returned (FALSE, the default). |
A tibble with the same number of rows as 'tof_tibble'. If augment = FALSE (the default), it will have 3 columns: "{time_col}" (the same column as 'time_col'), "timestep" (the numeric timestep to which each cell was assigned based on its value for 'time_col'), and "flagged_window" (a boolean vector indicating if each cell was collecting during a timestep flagged for having a high or low flow rate). If augment = TRUE, these 3 columns will be column-bound to 'tof_tibble' to return an augmented version of the input dataset. (Note that in this case, time_col will not be duplicated). If visualize = TRUE, then a ggplot object is returned instead of a tibble.
set.seed(1000L)
sim_data <-
data.frame(
cd4 = rnorm(n = 1000, mean = 5, sd = 0.5),
cd8 = rnorm(n = 1000, mean = 0, sd = 0.1),
cd33 = rnorm(n = 1000, mean = 10, sd = 0.1),
file_name = c(rep("a", times = 500), rep("b", times = 500)),
time =
c(
sample(1:100, size = 200, replace = TRUE),
sample(100:400, size = 300, replace = TRUE),
sample(1:150, size = 400, replace = TRUE),
sample(1:500, size = 100, replace = TRUE)
)
)
sim_data |>
tof_assess_flow_rate(
time_col = time,
num_timesteps = 20,
visualize = TRUE
)
sim_data |>
tof_assess_flow_rate(
time_col = time,
group_cols = file_name,
num_timesteps = 20,
visualize = TRUE
)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.