barcode_stat_test: Barcode Statistical Test


View source: R/barcode_stat_test.R

Description

Performs a statistical test tailored to clonal tracking experiments. For longitudinal observations of barcode abundances in the provided SE object, a Chi-squared or Fisher exact test is used to assess whether the proportion of each barcode has changed between samples.
Each column in the provided SE is tested against a comparison sample. If the 'stat_option' argument is left at its default of "subsequent", each sample is compared to the sample before it. If it is set to "reference", the column name of the reference sample must be provided, and every other column is tested against that reference sample.
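
To make the idea of a per-barcode test concrete, the sketch below builds a 2x2 contingency table for one barcode in two samples and runs both tests from base R. The proportions, the sample sizes, and the way the table is constructed (proportion multiplied by sample size, then rounded) are assumptions for illustration; the table that barcode_stat_test builds internally may differ.

```r
# Hypothetical inputs: the proportion of one barcode in two samples and the
# number of cells each sample represents (the role played by 'sample_size').
prop_ref <- 0.010
prop_test <- 0.025
n_ref <- 5000
n_test <- 5000

# Approximate cell counts with and without the barcode in each sample.
# This construction is an assumption, not the package's confirmed method.
tab <- matrix(
    c(round(prop_ref * n_ref), round((1 - prop_ref) * n_ref),
      round(prop_test * n_test), round((1 - prop_test) * n_test)),
    nrow = 2,
    dimnames = list(c("barcode", "other"), c("reference", "test"))
)

# Both tests ask whether the barcode proportion differs between the columns.
chisq_p <- chisq.test(tab)$p.value
fisher_p <- fisher.test(tab)$p.value
print(tab)
print(c(chi_squared = chisq_p, fisher = fisher_p))
```

The Fisher exact test is usually preferred when expected counts are small, which is one reason a larger 'sample_size' changes the outcome for the same proportions.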

Usage

barcode_stat_test(
  your_SE,
  sample_size,
  stat_test = "chi-squared",
  stat_option = "subsequent",
  reference_sample = NULL,
  p_adjust = "none",
  bc_threshold = 0
)

Arguments

your_SE

A SummarizedExperiment object containing clonal tracking data, as created by the barcodetrackR 'create_SE' function.

sample_size

A numeric vector giving the sample size for each column of the SummarizedExperiment passed to the function. The sample size describes the population that the barcoding data is meant to approximate, for example the number of cells from which barcodes were extracted.

stat_test

The statistical test to use on the contingency table constructed for each barcode. Options are "chi-squared" and "fisher".
For more information, see [chisq.test](https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/chisq.test) and [fisher.test](https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/fisher.test).

stat_option

For "subsequent", statistical testing is performed on each column of data compared to the column before it. For "reference", all other columns of data are compared to a reference column specified in the 'reference_sample' argument.

reference_sample

The column name of the reference column, used when stat_option is set to "reference". Defaults to the first column in the SummarizedExperiment.
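
The two 'stat_option' modes differ only in which column each sample is compared against. The sketch below spells out that pairing for four hypothetical column names (`t1` to `t4`); it illustrates the behavior described above and is not the package's internal code.

```r
# Hypothetical column names of a SummarizedExperiment, in column order.
samples <- c("t1", "t2", "t3", "t4")

# "subsequent": each column is paired with the column before it.
# The first column has no predecessor, so it has no comparison.
subsequent_ref <- c(NA, head(samples, -1))

# "reference": every other column is paired with one fixed column.
reference_sample <- "t1"
reference_ref <- ifelse(samples == reference_sample, NA, reference_sample)

pairing <- data.frame(
    sample = samples,
    subsequent = subsequent_ref,
    reference = reference_ref
)
print(pairing)
```

An NA in this table marks a column that has nothing to be compared with, which corresponds to the all-NA column described in the Value section.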

p_adjust

Character, default = "none". To correct p-values for multiple comparisons, set this to any of the p-value adjustment methods accepted by the p.adjust function in the R stats package: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", or "fdr".
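
As a rough illustration of what these adjustment methods do, the sketch below applies two of them to a made-up vector of p-values using p.adjust from the stats package. The p-values are hypothetical, and whether barcode_stat_test adjusts within each sample or across all samples is not stated here, so treat the grouping as an assumption.

```r
# Hypothetical raw p-values, e.g. one per barcode within a single comparison.
p_raw <- c(0.001, 0.012, 0.030, 0.200, 0.700)

# Benjamini-Hochberg controls the false discovery rate.
p_bh <- p.adjust(p_raw, method = "BH")

# Bonferroni multiplies each p-value by the number of tests, capped at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(data.frame(raw = p_raw, BH = p_bh, bonferroni = p_bonf))
```

Bonferroni is the more conservative of the two, so its adjusted values are never smaller than the BH values for the same input.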

bc_threshold

Clones must exceed this proportion in at least one sample to be included in statistical testing. Default is 0. Use this to ignore low-abundance clones, which are more likely to be noise or artifacts.
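
The filtering rule can be sketched on a small matrix of proportions: a barcode is kept if its largest proportion across samples exceeds the threshold. The matrix, the barcode names, and the threshold value below are made up for illustration, and the package's own filtering code may differ in detail.

```r
# Hypothetical barcode proportions: rows are barcodes, columns are samples.
props <- matrix(
    c(0.5000, 0.4000,
      0.4999, 0.5995,
      0.0001, 0.0005),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("bc1", "bc2", "bc3"), c("s1", "s2"))
)

bc_threshold <- 0.001

# Keep a barcode if it is above the threshold in at least one sample.
keep <- apply(props, 1, max) > bc_threshold
filtered <- props[keep, , drop = FALSE]
print(filtered)
```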

Value

Returns a list of 3 data frames containing the following information for each observation (barcode) that passed the provided bc_threshold:
[["FC"]], the fold change of barcode abundance for each sample relative to the previous sample or to the specified reference sample. To give the user maximal control over the results, the FC data frame contains 0 for barcodes where the test sample has an abundance of 0, Inf for barcodes where the reference sample has an abundance of 0, and NaN for barcodes where both the test and reference samples have an abundance of 0;
[["log_FC"]], the same as FC but as the log fold change. Again for maximal user control, the log_FC data frame contains NaN where the FC was NaN, -Inf where the FC was 0, and Inf where the FC was Inf;
[["p_val"]], the p-value returned by the Chi-squared or Fisher exact test, indicating whether each barcode changed in proportion between the test sample and the reference sample. The p-value is NaN if both abundances are 0; otherwise a p-value is assigned.
Note that one column of each resulting data frame contains only NAs. When the 'stat_option' argument is set to "subsequent", this is the first sample, since there is no preceding sample to compare it to. When the 'stat_option' argument is set to "reference", this is the reference sample.
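
The special values listed above follow directly from R's floating-point arithmetic. The sketch below reproduces them with made-up abundances; the use of log2 is an assumption, since the base of the logarithm is not stated here.

```r
# Hypothetical abundances for four barcodes in a test and a reference sample.
test_abund <- c(0.02, 0.00, 0.01, 0.00)
ref_abund <- c(0.01, 0.01, 0.00, 0.00)

# Fold change: an ordinary ratio, then 0, Inf and NaN for the zero cases.
fc <- test_abund / ref_abund

# Log fold change (base 2 assumed): 0 maps to -Inf, Inf to Inf, NaN to NaN.
log_fc <- log2(fc)

print(data.frame(test = test_abund, reference = ref_abund, FC = fc, log_FC = log_fc))

# Downstream code often restricts itself to finite values.
usable <- is.finite(log_fc)
```

Filtering with is.finite, as in the last line, is one way to drop barcodes that are absent from either sample before plotting or ranking by fold change.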

Examples

data(wu_subset)
barcode_stat_test(
    your_SE = wu_subset[, 1:4], sample_size = rep(5000, 4),
    stat_test = "chi-squared", stat_option = "subsequent",
    bc_threshold = 0.0001
)

d93espinoza/barcodetrackR documentation built on April 28, 2021, 1:58 p.m.