test_differential_abundance-methods: Perform differential transcription testing using edgeR QLT,...


Description

test_differential_abundance() takes as input a 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | and returns a 'tbl' with additional columns for the statistics from the hypothesis test.

Usage

test_differential_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  .contrasts = NULL,
  method = "edgeR_quasi_likelihood",
  scaling_method = "TMM",
  omit_contrast_in_colnames = FALSE,
  prefix = "",
  action = "add",
  significance_threshold = NULL,
  fill_missing_values = NULL
)

## S4 method for signature 'spec_tbl_df'
test_differential_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  .contrasts = NULL,
  method = "edgeR_quasi_likelihood",
  scaling_method = "TMM",
  omit_contrast_in_colnames = FALSE,
  prefix = "",
  action = "add",
  significance_threshold = NULL,
  fill_missing_values = NULL
)

## S4 method for signature 'tbl_df'
test_differential_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  .contrasts = NULL,
  method = "edgeR_quasi_likelihood",
  scaling_method = "TMM",
  omit_contrast_in_colnames = FALSE,
  prefix = "",
  action = "add",
  significance_threshold = NULL,
  fill_missing_values = NULL
)

## S4 method for signature 'tidybulk'
test_differential_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  .contrasts = NULL,
  method = "edgeR_quasi_likelihood",
  scaling_method = "TMM",
  omit_contrast_in_colnames = FALSE,
  prefix = "",
  action = "add",
  significance_threshold = NULL,
  fill_missing_values = NULL
)

## S4 method for signature 'SummarizedExperiment'
test_differential_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  .contrasts = NULL,
  method = "edgeR_quasi_likelihood",
  scaling_method = "TMM",
  omit_contrast_in_colnames = FALSE,
  prefix = "",
  action = "add",
  significance_threshold = NULL,
  fill_missing_values = NULL
)

## S4 method for signature 'RangedSummarizedExperiment'
test_differential_abundance(
  .data,
  .formula,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  .contrasts = NULL,
  method = "edgeR_quasi_likelihood",
  scaling_method = "TMM",
  omit_contrast_in_colnames = FALSE,
  prefix = "",
  action = "add",
  significance_threshold = NULL,
  fill_missing_values = NULL
)

Arguments

.data

A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |

.formula

A formula with no response variable, representing the desired linear model

.sample

The name of the sample column

.transcript

The name of the transcript/gene column

.abundance

The name of the transcript/gene abundance column

.contrasts

This parameter takes the form of the contrast argument of the chosen method. For edgeR and limma-voom it is a character vector. For DESeq2 it is a list of character vectors of length three. If no contrasts are given, the model is tested against the first covariate in the formula (e.g., ~ factor_of_interest).
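To see where the labels in an edgeR or limma-voom contrast string come from, it can help to inspect the design matrix that a no-intercept formula produces. The sketch below uses only base R and a hypothetical six-sample `condition` factor (not data from the package). It assumes the back end builds its design with `model.matrix`, as edgeR and limma workflows conventionally do.

```r
# Hypothetical sample annotation: three FALSE and three TRUE samples
condition <- factor(
  c("FALSE", "FALSE", "FALSE", "TRUE", "TRUE", "TRUE"),
  levels = c("FALSE", "TRUE")
)

# A no-intercept formula gives one design column per level of `condition`
design <- model.matrix(~ 0 + condition)

# These column names are the labels a contrast string can refer to
design_columns <- colnames(design)
print(design_columns)

# Build a contrast string of the form "<level 2> - <level 1>"
my_contrast <- paste(design_columns[2], "-", design_columns[1])
print(my_contrast)
```

The names `design_columns` and `my_contrast` are illustrative only. The point is that each term in a contrast string has to match a design column name exactly.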

method

A character string. One of "edgeR_quasi_likelihood" (i.e., QLF), "edgeR_likelihood_ratio" (i.e., LRT), "DESeq2", or "limma_voom".

scaling_method

A character string. The scaling method passed to the back-end function (i.e., edgeR::calcNormFactors; "TMM","TMMwsp","RLE","upperquartile")

omit_contrast_in_colnames

If just one contrast is specified you can choose to omit the contrast label in the colnames.

prefix

A character string. The prefix you would like to add to the result columns. It is useful if you want to compare several methods.

action

A character string. Whether to join the new information to the input tbl (add), or just get the non-redundant tbl with the new information (get).

significance_threshold

A real between 0 and 1 (usually 0.05).

fill_missing_values

A boolean. Whether to fill missing sample/transcript values with the median of the transcript. This is rarely needed.

Details

Lifecycle: maturing

This function provides the option to use edgeR https://doi.org/10.1093/bioinformatics/btp616, limma-voom https://doi.org/10.1186/gb-2014-15-2-r29, or DESeq2 https://doi.org/10.1186/s13059-014-0550-8 to perform the testing. All methods use raw counts, regardless of whether scale_abundance or adjust_abundance has been run; it is therefore essential to include covariates such as batch effects (if applicable) in the formula.
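Because the tests run on raw counts, a batch term can only enter through the formula. The base-R sketch below uses hypothetical sample annotations (the `condition` and `batch` vectors are made up for illustration) to show what a design with the factor of interest first and a batch covariate after it looks like. It assumes the design is built with `model.matrix`.

```r
# Hypothetical annotation for six samples processed in two batches
condition <- factor(c("control", "control", "control",
                      "treated", "treated", "treated"))
batch <- factor(c("a", "b", "a", "b", "a", "b"))

# Factor of interest first, batch covariate second
design <- model.matrix(~ condition + batch)

# Column 1 is the intercept; column 2 belongs to the first covariate
design_columns <- colnames(design)
print(design_columns)
```

With this ordering, the second design column corresponds to the factor of interest, and the batch term is estimated alongside it rather than removed beforehand.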

Underlying method for edgeR framework:

  .data %>%

  # Filter
  keep_abundant(
    factor_of_interest = !!(as.symbol(parse_formula(.formula)[1])),
    minimum_counts = minimum_counts,
    minimum_proportion = minimum_proportion
  ) %>%

  # Format
  select(!!.transcript, !!.sample, !!.abundance) %>%
  spread(!!.sample, !!.abundance) %>%
  as_matrix(rownames = !!.transcript) %>%

  # edgeR
  edgeR::DGEList(counts = .) %>%
  edgeR::calcNormFactors(method = scaling_method) %>%
  edgeR::estimateDisp(design) %>%

  # Fit
  edgeR::glmQLFit(design) %>%
  edgeR::glmQLFTest(coef = 2, contrast = my_contrasts) # or glmLRT according to choice

Underlying method for DESeq2 framework:

  # Filter
  keep_abundant(
    factor_of_interest = !!as.symbol(parse_formula(.formula)[[1]]),
    minimum_counts = minimum_counts,
    minimum_proportion = minimum_proportion
  ) %>%

  # DESeq2
  DESeq2::DESeqDataSet(design = .formula) %>%
  DESeq2::DESeq() %>%
  DESeq2::results()

Value

A 'tbl' with additional columns for the statistics from the test (e.g., log fold change, p-value and false discovery rate).

A 'SummarizedExperiment' object

Examples

library(tidybulk)

# Test against the first covariate in the formula
tidybulk::counts_mini %>%
  tidybulk(sample, transcript, count) %>%
  identify_abundant() %>%
  test_differential_abundance(~ condition)

# The function `test_differential_abundance` operates with contrasts too
tidybulk::counts_mini %>%
  tidybulk(sample, transcript, count) %>%
  identify_abundant() %>%
  test_differential_abundance(
    ~ 0 + condition,
    .contrasts = c("conditionTRUE - conditionFALSE")
  )

tidybulk documentation built on April 7, 2021, 6 p.m.