test_DTU: Perform differential splicing

View source: R/run.R

test_DTU    R Documentation

Perform differential splicing


Description

test_DTU performs differential splicing, via differential transcript usage (DTU), between two or more groups. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques, and a DTU test is performed via a multivariate Wald test on the posterior densities for the average relative abundance of transcripts.

Warning: the samples in samples_design must have the same order as those in the 'path_to_eq_classes' parameter of the create_data function.
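The idea of a multivariate Wald test on posterior draws can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation used by test_DTU: the draws are simulated stand-ins for posterior samples of the average relative abundances, and `rdirichlet_base` is a hypothetical helper written for this example.

```r
set.seed(42)

# Hypothetical helper: draw n samples from a Dirichlet(alpha) using base R only.
rdirichlet_base <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
  g / rowSums(g)
}

# Simulated stand-ins for posterior draws of the average relative abundance
# of K = 3 transcripts in two groups (NOT output of test_DTU).
pi_A <- rdirichlet_base(5000, c(60, 30, 10))
pi_B <- rdirichlet_base(5000, c(30, 60, 10))

# Proportions sum to 1, so drop the last transcript to avoid a singular covariance.
K <- ncol(pi_A)
diff_draws <- (pi_A - pi_B)[, -K, drop = FALSE]

delta <- colMeans(diff_draws)  # mean of the between-group difference
Sigma <- cov(diff_draws)       # covariance of the between-group difference

# Wald statistic, compared against a chi-squared with K - 1 degrees of freedom.
W <- as.numeric(t(delta) %*% solve(Sigma) %*% delta)
p_value <- pchisq(W, df = K - 1, lower.tail = FALSE)
```

A small `p_value` here indicates that the relative abundances differ between the two simulated groups; with more than two groups, the difference vector would stack several pairwise contrasts.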


Usage

test_DTU(
  BANDITS_data,
  precision = NULL,
  R = 10^4,
  burn_in = 2 * 10^3,
  samples_design,
  group_col_name = "group",
  n_cores = 1,
  gene_to_transcript,
  theshold_pval = 0.1
)



Arguments

BANDITS_data: a 'BANDITS_data' object.


precision: a vector with the mean and standard deviation of the log-precision parameter.


R: the number of iterations for the MCMC algorithm (after the burn-in). Minimum: 10^4. Although no difference was observed in simulation studies when increasing 'R' above 10^4, we encourage users to use higher values of 'R' (e.g., 2*10^4) if computational time allows, particularly for comparisons between three or more groups.


burn_in: the length of the burn-in to be discarded (before convergence is reached). Minimum: 2*10^3. Although no difference was observed in simulation studies when increasing 'burn_in' above 2*10^3, we encourage users to use higher values of 'burn_in' (e.g., double) if computational time allows.
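How the number of kept iterations and the burn-in relate can be sketched with a toy sampler. This is only an illustration: the random-walk Metropolis sampler and the standard normal target below are assumptions chosen for simplicity and are unrelated to the model fitted by test_DTU.

```r
set.seed(1)

R <- 10^4            # iterations kept after the burn-in
burn_in <- 2 * 10^3  # initial iterations to discard

# Toy target: standard normal log-density (an assumption for illustration).
log_target <- function(x) dnorm(x, log = TRUE)

n_total <- R + burn_in
chain <- numeric(n_total)
current <- 10  # deliberately poor starting value

for (i in seq_len(n_total)) {
  proposal <- current + rnorm(1)
  # Metropolis acceptance step on the log scale.
  if (log(runif(1)) < log_target(proposal) - log_target(current)) {
    current <- proposal
  }
  chain[i] <- current
}

# Discard the burn-in and keep the remaining R draws.
kept <- chain[(burn_in + 1):n_total]
```

The chain runs for `R + burn_in` iterations in total, and only the last `R` draws are used for inference, so the poor starting value does not affect the summaries.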


samples_design: a data.frame indicating the design of the experiment, with one row for each sample. It must contain a column with the sample id and one with the group id. Warning: the samples in samples_design must have the same order as those in the 'path_to_eq_classes' parameter of the create_data function.
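A quick check of the sample order might look like the sketch below. The directory layout and the `path_to_eq_classes` vector are hypothetical, and the rule for extracting a sample id from a path is an assumption that would need adapting to the actual file structure.

```r
# Hypothetical paths, in the order they would be passed as 'path_to_eq_classes'.
path_to_eq_classes <- file.path("salmon_output", paste0("sample", seq_len(4)),
                                "aux_info", "eq_classes.txt")

samples_design <- data.frame(sample_id = paste0("sample", seq_len(4)),
                             group = c("A", "A", "B", "B"))

# Assumption: the sample id is the directory two levels above each file.
ids_from_paths <- basename(dirname(dirname(path_to_eq_classes)))

# TRUE only if both the ids and their order agree.
same_order <- identical(ids_from_paths, as.character(samples_design$sample_id))
stopifnot(same_order)
```

Running a check of this kind before calling test_DTU guards against silently assigning samples to the wrong group.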


group_col_name: the name of the column of 'samples_design' containing the group id. By default, group_col_name = "group".


n_cores: the number of cores to parallelize the tasks on.


gene_to_transcript: a matrix or data.frame with a list of gene-to-transcript correspondences. The first column contains the gene id, and the second one contains the transcript id.
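The two-column layout described above can be sketched as follows. The gene and transcript ids are made up for illustration; real ids would come from the annotation used for quantification.

```r
# Hypothetical ids: gene id in the first column, transcript id in the second.
gene_to_transcript <- data.frame(
  gene_id = c("geneA", "geneA", "geneA", "geneB", "geneB"),
  transcript_id = c("trA1", "trA2", "trA3", "trB1", "trB2"),
  stringsAsFactors = FALSE
)

# Sanity checks: two columns, and each transcript maps to exactly one gene.
stopifnot(ncol(gene_to_transcript) == 2,
          !anyDuplicated(gene_to_transcript$transcript_id))

# Number of transcripts per gene.
transcripts_per_gene <- table(gene_to_transcript$gene_id)
```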


theshold_pval: a threshold between 0 and 1. When running test_DTU, if the p-value of a gene is < theshold_pval, a second (independent) MCMC chain is run and the p-value is re-computed on the aggregation of the two chains. By default theshold_pval = 0.1; theshold_pval = 1 corresponds to running all chains twice, and theshold_pval = 0 means all chains will only run once.
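The decision rule for the second chain can be sketched as below. `needs_second_chain` is a hypothetical helper written for this illustration, not a BANDITS function, and the p-values are invented.

```r
# Hypothetical helper: TRUE for genes whose first-chain p-value falls
# below the threshold, i.e. genes for which a second chain would be run.
needs_second_chain <- function(p_value, theshold_pval = 0.1) {
  p_value < theshold_pval
}

# Invented first-chain p-values for three genes.
first_chain_p <- c(geneA = 0.003, geneB = 0.08, geneC = 0.45)

rerun_default <- needs_second_chain(first_chain_p)                  # default 0.1
rerun_none <- needs_second_chain(first_chain_p, theshold_pval = 0)  # no second chains
rerun_all <- needs_second_chain(first_chain_p, theshold_pval = 1)   # every gene here
```

Lower thresholds save computation by re-running only the most promising genes, while higher thresholds spend more time to stabilise a larger share of the p-values.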


Value

A 'BANDITS_test' object.


Author(s)

Simone Tiberi simone.tiberi@uzh.ch

See Also

create_data, BANDITS_data, BANDITS_test


Examples

# Load the gene-to-transcript matching:
data("gene_tr_id", package = "BANDITS")

# Define the design of the study:
samples_design = data.frame(sample_id = paste0("sample", seq_len(4)),
                            group = c("A", "A", "B", "B"))

# load the pre-computed data:
data("input_data", package = "BANDITS")

# Filter lowly abundant genes:
input_data = filter_genes(input_data, min_counts_per_gene = 20)

# load the pre-computed precision estimates:
data(precision, package = "BANDITS")

## Test for DTU
results = test_DTU(BANDITS_data = input_data,
                   precision = precision$prior,
                   samples_design = samples_design,
                   R = 10^4, burn_in = 2*10^3, n_cores = 2,
                   gene_to_transcript = gene_tr_id)

SimoneTiberi/BANDITS documentation built on Nov. 15, 2023, 2:35 p.m.