View source: R/differential_discovery.R
tof_analyze_expression_ttest | R Documentation |
This function performs differential expression analysis on the cell clusters contained within a 'tof_tbl' using simple t-tests. Specifically, each marker's expression is first summarized within each cluster of each sample using a user-specified summary function (e.g. mean or median). An unpaired or paired t-test then compares these per-sample summaries between two conditions. One t-test is conducted per cluster/marker pair, and significant differences between sample types are detected after multiple-hypothesis correction.
tof_analyze_expression_ttest(
  tof_tibble,
  cluster_col,
  marker_cols = where(tof_is_numeric),
  effect_col,
  group_cols,
  test_type = c("unpaired", "paired"),
  summary_function = mean,
  min_cells = 3,
  min_samples = 5,
  alpha = 0.05,
  quiet = FALSE
)
tof_tibble |
A 'tof_tbl' or a 'tibble'. |
cluster_col |
An unquoted column name indicating which column in 'tof_tibble' stores the cluster ids of the cluster to which each cell belongs. Cluster labels can be produced via any method the user chooses - including manual gating, any of the functions in the 'tof_cluster_*' function family, or any other method. |
marker_cols |
Unquoted column names representing which columns in 'tof_tibble' (i.e. which high-dimensional cytometry protein measurements) should be tested for differential expression between levels of the 'effect_col'. Defaults to all numeric (integer or double) columns. Supports tidyselect helpers. |
effect_col |
Unquoted column name representing which column in 'tof_tibble' should be used to break samples into groups for the t-test. Should only have 2 unique values. |
group_cols |
Unquoted names of the columns other than 'effect_col' that should be used to group cells into independent observations. Fills a similar role to 'sample_col' in the 'tof_analyze_abundance_*' functions. For example, if an experiment involves analyzing samples taken from multiple patients at two timepoints (with 'effect_col = timepoint'), then 'group_cols' should be the name of the column representing patient IDs. |
test_type |
A string indicating whether the t-test should be "unpaired" (the default) or "paired". |
summary_function |
The vector-valued function that should be used to summarize the distribution of each marker in each cluster (within each sample, as grouped by 'group_cols'). Defaults to 'mean'. |
min_cells |
An integer value used to filter clusters out of the differential expression analysis. Clusters are not included in the differential expression testing if they do not have at least 'min_cells' cells in at least 'min_samples' samples. Defaults to 3. |
min_samples |
An integer value used to filter clusters out of the differential expression analysis. Clusters are not included in the differential expression testing if they do not have at least 'min_cells' cells in at least 'min_samples' samples. Defaults to 5. |
alpha |
A numeric value between 0 and 1 indicating which significance level should be applied to multiple-comparison adjusted p-values during the differential expression analysis. Defaults to 0.05. |
quiet |
A boolean value indicating whether warnings should be suppressed rather than printed. Defaults to 'FALSE'. |
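The 'min_cells' / 'min_samples' filtering rule can be illustrated with a small base R sketch. This is not the package's implementation; the data frame 'cells' and its column names are hypothetical, and the sketch only shows the rule as described in the argument table.

```r
# Hypothetical per-cell data: one row per cell, labeled by sample and cluster.
# "common" has 10 cells in each of 6 samples; "rare" has 3 cells in only 2 samples.
cells <- data.frame(
  sample  = c(rep(paste0("s", 1:6), each = 10), rep(c("s1", "s2"), each = 3)),
  cluster = c(rep("common", 60), rep("rare", 6))
)

min_cells <- 3
min_samples <- 5

# Count cells per cluster (rows) and sample (columns).
counts <- table(cells$cluster, cells$sample)

# For each cluster, count the samples that contain at least `min_cells` cells.
samples_passing <- rowSums(counts >= min_cells)

# Keep only clusters that reach `min_cells` in at least `min_samples` samples.
kept_clusters <- names(samples_passing)[samples_passing >= min_samples]
kept_clusters
```

Here the "rare" cluster reaches 'min_cells' in only two samples, so it would be dropped before any t-test is run.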
A tibble with 9 columns:
{cluster_col} |
The name/ID of the cluster in the cluster/marker pair being tested. Each entry in this column will match a unique value in the input {cluster_col}. |
marker |
The name of the marker in the cluster/marker pair being tested. |
t |
The t-statistic computed for each cluster/marker pair. |
df |
The degrees of freedom used for the t-test for each cluster/marker pair. |
p_val |
The (unadjusted) p-value for the t-test for each cluster/marker pair. |
p_adj |
The 'p.adjust'-adjusted p-value for the t-test for each cluster/marker pair. |
significant |
A character vector that will be "*" for cluster/marker pairs for which p_adj < alpha and "" otherwise. |
mean_diff |
For an unpaired t-test, the difference between the average summarized marker expression values in the two levels of 'effect_col'. For a paired t-test, the average within-patient difference between the summarized marker expression values in the two levels of 'effect_col'. |
mean_fc |
For an unpaired t-test, the ratio between the average summarized marker expression values in the two levels of 'effect_col'. For a paired t-test, the average within-patient ratio between the summarized marker expression values in the two levels of 'effect_col'. 0.001 is added to the denominator of the ratio to avoid divide-by-zero errors. |
The "levels" attribute of the result indicates the order in which the different levels of the 'effect_col' were considered. The 'mean_diff' value for each row of the output is computed by subtracting the second level from the first level, and the 'mean_fc' value for each row is computed by dividing the first level by the second level.
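As a worked illustration of how the level order determines the sign of 'mean_diff' and the direction of 'mean_fc', the following base R sketch computes both quantities for a single cluster/marker pair. The data frame 'summaries', its values, and the level names "stim" and "basal" are hypothetical, and the code is a sketch of the description above rather than the package's own code.

```r
# Hypothetical per-sample summaries for one cluster/marker pair
# (e.g. the mean expression of one marker in one cluster, per sample).
summaries <- data.frame(
  patient   = rep(c("p1", "p2", "p3"), times = 2),
  condition = factor(rep(c("stim", "basal"), each = 3), levels = c("stim", "basal")),
  value     = c(4, 5, 6, 1, 2, 3)
)

# The level order decides which group is "first" and which is "second".
lvls <- levels(summaries$condition)
first  <- summaries$value[summaries$condition == lvls[1]]
second <- summaries$value[summaries$condition == lvls[2]]

# Unpaired: compare the group averages (first level minus / over second level).
mean_diff <- mean(first) - mean(second)
mean_fc   <- mean(first) / (mean(second) + 0.001)  # 0.001 guards against division by zero

# Paired: average the within-patient differences and ratios instead.
# Rows are already ordered by patient within each condition in this toy data.
paired_mean_diff <- mean(first - second)
paired_mean_fc   <- mean(first / (second + 0.001))
```

Swapping the level order (for example 'levels = c("basal", "stim")') would flip the sign of 'mean_diff' and invert the direction of 'mean_fc'.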
Other differential expression analysis functions: tof_analyze_expression(), tof_analyze_expression_diffcyt(), tof_analyze_expression_lmm()
# For differential discovery examples, please see the package vignettes
NULL
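For readers who want to see the underlying idea without the package, the procedure described at the top of this page (summarize per sample, run one t-test per cluster/marker pair, then adjust the p-values) can be sketched in base R. Everything below is an assumption made for illustration: the simulated data, the marker names 'cd45' and 'cd3', the use of Welch's unpaired t-test via 't.test()', and the "fdr" adjustment method. It does not call 'tof_analyze_expression_ttest()' and may differ from what the function does internally.

```r
set.seed(2024)

# Hypothetical single-cell data: 6 samples (3 per condition), 2 clusters, 2 markers.
n_cells <- 600
cells <- data.frame(
  sample  = rep(paste0("s", 1:6), each = 100),
  cluster = rep(c("c1", "c2"), times = 300),
  cd45    = rnorm(n_cells),
  cd3     = rnorm(n_cells)
)
cells$condition <- ifelse(cells$sample %in% c("s1", "s2", "s3"), "basal", "stim")

# Simulate a true effect: raise cd3 in cluster c1 of the stimulated samples.
shifted <- cells$condition == "stim" & cells$cluster == "c1"
cells$cd3[shifted] <- cells$cd3[shifted] + 3

# Step 1: summarize each marker within each cluster of each sample (here: the mean).
per_sample <- aggregate(
  cbind(cd45, cd3) ~ sample + condition + cluster,
  data = cells, FUN = mean
)

# Step 2: run one unpaired t-test per cluster/marker pair on the per-sample summaries.
pairs <- expand.grid(
  cluster = sort(unique(cells$cluster)),
  marker = c("cd45", "cd3"),
  stringsAsFactors = FALSE
)
results <- do.call(rbind, lapply(seq_len(nrow(pairs)), function(i) {
  rows <- per_sample[per_sample$cluster == pairs$cluster[i], ]
  values <- rows[[pairs$marker[i]]]
  test <- t.test(values[rows$condition == "stim"], values[rows$condition == "basal"])
  data.frame(
    cluster = pairs$cluster[i],
    marker  = pairs$marker[i],
    t       = unname(test$statistic),
    df      = unname(test$parameter),
    p_val   = test$p.value
  )
}))

# Step 3: correct for multiple hypotheses and flag significant pairs at alpha = 0.05.
results$p_adj <- p.adjust(results$p_val, method = "fdr")
results$significant <- ifelse(results$p_adj < 0.05, "*", "")
results
```

With only three samples per condition, each test has few degrees of freedom, which is one reason the 'min_samples' filter matters in practice.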