tof_analyze_expression_ttest: Differential Expression Analysis (DEA) with t-tests

View source: R/differential_discovery.R

tof_analyze_expression_ttest    R Documentation

Differential Expression Analysis (DEA) with t-tests

Description

This function performs differential expression analysis on the cell clusters contained within a 'tof_tbl' using simple t-tests. Specifically, marker expression is first summarized within each cluster of each sample using a user-specified summary function (e.g. mean or median); an unpaired or paired t-test then compares these per-sample summaries between two conditions. One t-test is conducted per cluster/marker pair, and significant differences between sample types are detected after multiple-hypothesis correction.
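The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the column names ('patient', 'timepoint', 'cluster', 'cd45'), the helper 'run_test', and the choice of the "BH" method for 'p.adjust' are all hypothetical.

```r
# Toy single-cell data (hypothetical): 6 patients, 2 timepoints, 2 clusters,
# 20 cells per patient/timepoint/cluster, and a single marker called cd45.
set.seed(1)
cells <- expand.grid(
  cell = 1:20,
  patient = paste0("p", 1:6),
  timepoint = c("pre", "post"),
  cluster = c("c1", "c2"),
  stringsAsFactors = FALSE
)
cells$cd45 <- rnorm(nrow(cells), mean = ifelse(cells$timepoint == "post", 1.5, 1))

# Step 1: summarize marker expression per cluster within each sample
# (here a sample is one patient at one timepoint).
summarized <- aggregate(cd45 ~ cluster + patient + timepoint, data = cells, FUN = mean)

# Step 2: run one t-test per cluster/marker pair on the per-sample summaries.
# run_test is a hypothetical helper written for this sketch.
run_test <- function(df, paired = FALSE) {
  df <- df[order(df$patient), ]            # align patients across timepoints
  pre <- df$cd45[df$timepoint == "pre"]
  post <- df$cd45[df$timepoint == "post"]
  res <- t.test(post, pre, paired = paired)
  data.frame(
    t = unname(res$statistic),
    df = unname(res$parameter),
    p_val = res$p.value
  )
}
results <- do.call(
  rbind,
  lapply(split(summarized, summarized$cluster), run_test, paired = TRUE)
)
results$cluster <- rownames(results)

# Step 3: correct for multiple hypotheses and flag significant pairs.
# The "BH" method is an assumption made for this sketch.
results$p_adj <- p.adjust(results$p_val, method = "BH")
results$significant <- ifelse(results$p_adj < 0.05, "*", "")
results
```

Testing per-sample summaries rather than individual cells keeps the unit of observation at the sample level, which is what the t-test's independence assumption requires.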

Usage

tof_analyze_expression_ttest(
  tof_tibble,
  cluster_col,
  marker_cols = where(tof_is_numeric),
  effect_col,
  group_cols,
  test_type = c("unpaired", "paired"),
  summary_function = mean,
  min_cells = 3,
  min_samples = 5,
  alpha = 0.05,
  quiet = FALSE
)

Arguments

tof_tibble

A 'tof_tbl' or a 'tibble'.

cluster_col

An unquoted column name indicating which column in 'tof_tibble' stores the ID of the cluster to which each cell belongs. Cluster labels can be produced via any method the user chooses, including manual gating, any of the functions in the 'tof_cluster_*' function family, or any other method.

marker_cols

Unquoted column names representing which columns in 'tof_tibble' (i.e. which high-dimensional cytometry protein measurements) should be tested for differential expression between levels of the 'effect_col'. Defaults to all numeric (integer or double) columns. Supports tidyselect helpers.

effect_col

Unquoted column name representing which column in 'tof_tibble' should be used to break samples into groups for the t-test. This column should contain exactly 2 unique values.

group_cols

Unquoted names of the columns other than 'effect_col' that should be used to group cells into independent observations. Fills a similar role to 'sample_col' in the 'tof_analyze_abundance_*' functions. For example, if an experiment involves analyzing samples taken from multiple patients at two timepoints (with 'effect_col = timepoint'), then 'group_cols' should be the name of the column representing patient IDs.

test_type

A string indicating whether the t-test should be "unpaired" (the default) or "paired".

summary_function

The function (taking a numeric vector and returning a single value) that should be used to summarize the distribution of each marker in each cluster (within each sample, as grouped by 'group_cols'). Defaults to 'mean'.

min_cells

An integer value used to filter clusters out of the differential expression analysis. Clusters are not included in the differential expression testing if they do not have at least 'min_cells' cells in at least 'min_samples' samples. Defaults to 3.

min_samples

An integer value used to filter clusters out of the differential expression analysis. Clusters are not included in the differential expression testing if they do not have at least 'min_cells' cells in at least 'min_samples' samples. Defaults to 5.

alpha

A numeric value between 0 and 1 indicating which significance level should be applied to multiple-comparison adjusted p-values during the differential expression analysis. Defaults to 0.05.

quiet

A boolean value indicating whether warnings should be printed. Defaults to 'FALSE'.
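The cluster filter controlled by 'min_cells' and 'min_samples' can be illustrated with a small base R sketch. The counts table and its column names are hypothetical, and the package's internal filtering may differ in detail; the sketch only shows the rule as worded in the argument descriptions.

```r
# Hypothetical per-sample cell counts for two clusters across six samples.
counts <- data.frame(
  cluster = rep(c("c1", "c2"), each = 6),
  sample = rep(paste0("s", 1:6), times = 2),
  n_cells = c(
    10, 8, 12, 4, 9, 7, # c1: all six samples have at least 3 cells
    5, 1, 0, 2, 6, 1    # c2: only two samples have at least 3 cells
  )
)
min_cells <- 3
min_samples <- 5

# For each cluster, count the samples that contain at least min_cells cells.
n_ok <- tapply(counts$n_cells >= min_cells, counts$cluster, sum)

# Keep only clusters that meet the threshold in at least min_samples samples.
kept <- names(n_ok)[n_ok >= min_samples]
kept
```

With the defaults shown in Usage, cluster "c2" would be dropped before testing because too few samples contain enough of its cells to give a stable per-sample summary.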

Value

A tibble with 9 columns:

{cluster_col}

The name/ID of the cluster in the cluster/marker pair being tested. Each entry in this column will match a unique value in the input {cluster_col}.

marker

The name of the marker in the cluster/marker pair being tested.

t

The t-statistic computed for each cluster/marker pair.

df

The degrees of freedom used for the t-test for each cluster/marker pair.

p_val

The (unadjusted) p-value for the t-test for each cluster/marker pair.

p_adj

The p.adjust-adjusted p-value for the t-test for each cluster/marker pair.

significant

A character vector that will be "*" for cluster/marker pairs for which p_adj < alpha and "" otherwise.

mean_diff

For an unpaired t-test, the difference between the average summarized marker expression of each cluster in the two levels of 'effect_col'. For a paired t-test, the average difference between the summarized marker expression of each cluster in the two levels of 'effect_col' within a given pairing unit defined by 'group_cols' (e.g. a patient).

mean_fc

For an unpaired t-test, the ratio between the average summarized marker expression of each cluster in the two levels of 'effect_col'. For a paired t-test, the average ratio between the summarized marker expression of each cluster in the two levels of 'effect_col' within a given pairing unit defined by 'group_cols' (e.g. a patient). 0.001 is added to the denominator of the ratio to avoid divide-by-zero errors.

The "levels" attribute of the result indicates the order in which the different levels of the 'effect_col' were considered. The 'mean_diff' value for each row of the output is computed by subtracting the second level from the first level, and the 'mean_fc' value for each row is computed by dividing the first level by the second level.
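The 'mean_diff' and 'mean_fc' definitions can be worked through by hand for a single cluster/marker pair. The sketch below uses hypothetical per-sample summaries and an assumed level order of c("post", "pre"); it follows the wording of the column descriptions and is not taken from the package source.

```r
# Hypothetical per-sample summaries for one cluster/marker pair.
summaries <- data.frame(
  patient = rep(c("p1", "p2", "p3"), times = 2),
  timepoint = rep(c("post", "pre"), each = 3),
  expression = c(2.0, 3.0, 4.0, 1.0, 2.0, 2.0)
)

# Assumed order of the "levels" attribute: first level, then second level.
lvls <- c("post", "pre")
first <- summaries$expression[summaries$timepoint == lvls[1]]
second <- summaries$expression[summaries$timepoint == lvls[2]]

# Unpaired: compare the two group averages.
unpaired_diff <- mean(first) - mean(second)
unpaired_fc <- mean(first) / (mean(second) + 0.001)

# Paired: average the within-patient differences and ratios.
# Rows are already aligned by patient in this toy table.
paired_diff <- mean(first - second)
paired_fc <- mean(first / (second + 0.001))

c(unpaired_diff = unpaired_diff, paired_diff = paired_diff,
  unpaired_fc = unpaired_fc, paired_fc = paired_fc)
```

When every patient appears in both levels, the paired and unpaired differences coincide, while the fold changes generally do not, because a ratio of averages is not the same as an average of ratios.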

See Also

Other differential expression analysis functions: tof_analyze_expression(), tof_analyze_expression_diffcyt(), tof_analyze_expression_lmm()

Examples

# For differential discovery examples, please see the package vignettes
NULL


keyes-timothy/tidytof documentation built on Aug. 28, 2024, 8:37 a.m.