test_DTU: Perform differential splicing
In SimoneTiberi/BANDITS: BANDITS: Bayesian ANalysis of DIfferenTial Splicing

Description

test_DTU performs differential splicing, via differential transcript usage (DTU), between 2 or more groups. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques and a DTU test is performed via a multivariate Wald test on the posterior densities for the average relative abundance of transcripts. Warning: the samples in samples_design must have the same order as those in the 'path_to_eq_classes' parameter of the create_data function.
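The ordering requirement in the warning above can be checked before calling test_DTU. The following is a minimal sketch in base R, not part of BANDITS: the directory layout and the names `equiv_classes_files`, `samples_in_paths` and `order_ok` are assumptions made for illustration, and the way the sample name is recovered from each path has to be adapted to the actual file layout.

```r
# Hypothetical paths, in the order they would be passed to create_data()
# via 'path_to_eq_classes' (the directory layout is an assumption).
sample_names <- paste0("sample", seq_len(4))
equiv_classes_files <- file.path("data", sample_names,
                                 "aux_info", "eq_classes.txt")

# Design of the study, one row per sample.
samples_design <- data.frame(sample_id = sample_names,
                             group = c("A", "A", "B", "B"))

# Recover the sample name from each path: here it is the directory
# two levels above the file.
samples_in_paths <- basename(dirname(dirname(equiv_classes_files)))

# Compare position by position; stop early if the orders differ.
order_ok <- identical(samples_in_paths,
                      as.character(samples_design$sample_id))
stopifnot(order_ok)
```

If the check fails, reordering the rows of samples_design to match the paths (rather than the other way round) keeps the input to create_data unchanged.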

Usage

test_DTU(
  BANDITS_data,
  precision = NULL,
  R = 10^4,
  burn_in = 2 * 10^3,
  samples_design,
  group_col_name = "group",
  n_cores = 1,
  gene_to_transcript,
  theshold_pval = 0.1
)

Arguments

`BANDITS_data`	a 'BANDITS_data' object.
`precision`	a vector with the mean and standard deviation of the log-precision parameter.
`R`	the number of iterations for the MCMC algorithm (after the burn-in). Min 10^4. Albeit no difference was observed in simulation studies when increasing 'R' above 10^4, we encourage users to possibly use higher values of R (e.g., 2*10^4), if the computational time allows it, particularly for comparisons between 3 or more groups.
`burn_in`	the length of the burn-in to be discarded (before convergence is reached). Min 2*10^3. Although no difference was observed in simulation studies when increasing 'burn_in' above 2*10^3, we encourage users to consider higher values of 'burn_in' (e.g., double) if the computational time allows it.
`samples_design`	a `data.frame` indicating the design of the experiment with one row for each sample: samples_design must contain a column with the sample id and one with the group id. Warning: the samples in samples_design must have the same order as those in the 'path_to_eq_classes' parameter of the `create_data` function.
`group_col_name`	the name of the column of 'samples_design' containing the group id. By default group_col_name = "group".
`n_cores`	the number of cores to parallelize the tasks on.
`gene_to_transcript`	a matrix or data.frame with a list of gene-to-transcript correspondences. The first column represents the gene id, while the second one contains the transcript id.
`theshold_pval`	a threshold between 0 and 1; when running `test_DTU`, if the p.value of a gene is < theshold_pval, a second (independent) MCMC chain is run and the p.value is re-computed on the aggregation of the two chains. By default theshold_pval = 0.1, while theshold_pval = 1 corresponds to running all chains twice, and theshold_pval = 0 means all chains will only run once.
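The effect of 'theshold_pval' can be illustrated with a small sketch of the documented decision rule only. This is not the package's internal code: the helper `needs_second_chain` and the p-values below are hypothetical, and the re-computation of the p.value on the two aggregated chains is not modelled.

```r
# Hypothetical first-chain p-values for five genes (illustrative numbers).
p_first_chain <- c(geneA = 0.003, geneB = 0.08, geneC = 0.10,
                   geneD = 0.45, geneE = 0.92)

# Documented rule: a second, independent chain is run for a gene
# when its p.value is strictly below theshold_pval.
needs_second_chain <- function(p_values, theshold_pval = 0.1) {
  stopifnot(theshold_pval >= 0, theshold_pval <= 1)
  p_values < theshold_pval
}

rerun_default <- needs_second_chain(p_first_chain)                     # default 0.1
rerun_none    <- needs_second_chain(p_first_chain, theshold_pval = 0)  # no second chain
rerun_all     <- needs_second_chain(p_first_chain, theshold_pval = 1)  # all p < 1 rerun

# Genes that would get a second chain under the default threshold.
names(p_first_chain)[rerun_default]
```

Because the comparison is strict, a gene whose p.value equals the threshold exactly (geneC here) is not rerun in this sketch.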

Value

A BANDITS_test object.

Author(s)

Simone Tiberi simone.tiberi@uzh.ch

Examples

# load gene_to_transcript matching:
data("gene_tr_id", package = "BANDITS")

# We define the design of the study
samples_design = data.frame(sample_id = paste0("sample", seq_len(4)),
                            group = c("A", "A", "B", "B"))

# load the pre-computed data:
data("input_data", package = "BANDITS")
input_data

# Filter lowly abundant genes:
input_data = filter_genes(input_data, min_counts_per_gene = 20)

# load the pre-computed precision estimates:
data(precision, package = "BANDITS")

## Test for DTU
set.seed(61217)
results = test_DTU(BANDITS_data = input_data,
                   precision = precision$prior,
                   samples_design = samples_design,
                   R = 10^4, burn_in = 2*10^3, n_cores = 2,
                   gene_to_transcript = gene_tr_id)
results

SimoneTiberi/BANDITS documentation built on Nov. 15, 2023, 2:35 p.m.