BANDITS_test-class: BANDITS_test class
In BANDITS: BANDITS: Bayesian ANalysis of DIfferenTial Splicing


BANDITS_test contains the results of the differential transcript usage (DTU) test. A BANDITS_test object is created via test_DTU and is organized in three data.frames containing gene-level results, transcript-level results, and convergence diagnostics of the Markov chain Monte Carlo (MCMC) posterior chains. Convergence is assessed with Heidelberger and Welch's convergence diagnostic, implemented in coda::heidel.diag, which tests the stationarity of the chain of the full log-posterior density; the null hypothesis of stationarity is rejected when the p.value falls below 0.01.

## S4 method for signature 'BANDITS_test'
show(object)

## S4 method for signature 'BANDITS_test'
convergence(x)

## S4 method for signature 'BANDITS_test'
top_genes(x, n = Inf, sort_by_g = "p.value")

## S4 method for signature 'BANDITS_test'
top_transcripts(x, n = Inf,
  sort_by_tr = "gene")

## S4 method for signature 'BANDITS_test'
gene(x, gene_id)

## S4 method for signature 'BANDITS_test'
transcript(x, transcript_id)

## S4 method for signature 'BANDITS_test'
plot_proportions(x, gene_id, CI = TRUE,
  CI_level = 0.95)

`object, x`	a 'BANDITS_test' object.
`n`	the number of genes or transcripts to report. By default `n = Inf` and all results will be reported.
`sort_by_g`	"p.value" for sorting results according to gene-level significance (i.e., p.value); "DTU_measure" for sorting results according to the 'DTU_measure' (check the vignette for details).
`sort_by_tr`	"gene" for sorting results according to gene-level significance (i.e., p.value); "transcript" for sorting results according to transcript-level significance (i.e., p.value).
`gene_id`	a character string or vector indicating the gene or genes whose results should be retrieved.
`transcript_id`	a character string or vector indicating the transcript or transcripts whose results should be retrieved.
`CI`	a logical element indicating whether to also plot confidence boundaries (TRUE) or not (FALSE).
`CI_level`	a number between 0 and 1, indicating the level of the confidence interval to plot.

show(object): prints the number of gene and transcript level results in the BANDITS_test object.
top_genes(x, n = Inf, sort_by_g = "p.value"): returns the gene-level results of the DTU test for the top 'n' significant genes. By default n = Inf and all results will be reported. sort_by_g = "p.value" for sorting results according to gene-level significance; sort_by_g = "DTU_measure" for sorting results according to the 'DTU_measure'.
top_transcripts(x, n = Inf, sort_by_tr = "gene"): returns the transcript-level results of the DTU test for the top 'n' significant transcripts. By default n = Inf and all results will be reported. sort_by_tr = "gene" for sorting results according to gene-level significance; sort_by_tr = "transcript" for sorting results according to transcript-level significance.
convergence(x): returns the convergence diagnostic of the posterior MCMC chains for every gene.
gene(x, gene_id): returns a list with all results for the gene(s) specified in 'gene_id': gene results, corresponding transcript results and convergence diagnostic.
transcript(x, transcript_id): returns a list with all results for the transcript(s) specified in 'transcript_id': transcript results, corresponding gene results and convergence diagnostic.
plot_proportions(x, gene_id, CI = TRUE, CI_level = 0.95): plots the posterior means of the average relative expression of the transcripts (i.e., the proportions) in each condition, for the gene specified in 'gene_id'. If 'CI' is TRUE, a profile Wald-type confidence interval is also plotted for each estimated transcript proportion; the level of the confidence interval is specified by 'CI_level'.
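
To make the role of 'CI_level' concrete, the following is a minimal base-R sketch of a plain Wald-type interval built from a posterior mean and standard deviation. The numbers and the variable names (post_mean, post_sd) are hypothetical, and the truncation to [0, 1] is an assumption of this sketch; it is not taken from the BANDITS source and may differ from what plot_proportions computes internally.

```r
# Hypothetical posterior summaries for the transcripts of one gene in one group
# (illustrative numbers, not BANDITS output).
post_mean <- c(tr1 = 0.70, tr2 = 0.25, tr3 = 0.05)
post_sd   <- c(tr1 = 0.04, tr2 = 0.03, tr3 = 0.03)
CI_level  <- 0.95

# Two-sided standard normal quantile for the requested level.
z <- qnorm(1 - (1 - CI_level) / 2)

# Wald-type interval: mean +/- z * SD.
# Truncating to [0, 1] is an assumption of this sketch, since proportions
# cannot fall outside that range.
lower <- pmax(post_mean - z * post_sd, 0)
upper <- pmin(post_mean + z * post_sd, 1)

ci <- data.frame(mean = post_mean, lower = lower, upper = upper)
print(round(ci, 3))
```

A larger 'CI_level' gives a larger quantile z and therefore wider intervals.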

Gene_results

a data.frame containing the gene-level results of the DTU test, structured in the following columns:

Gene_id contains the gene names;
p.values contains the gene-level p.values of the DTU test;
adj.p.values contains the Benjamini-Hochberg adjusted p.values (via p.adjust);
p.values_inverted (only available for 2-group comparisons) is a conservative p.value, accounting for the inversion of the dominant transcript between conditions: p.values_inverted = p.values, if the dominant transcript varies between conditions, and p.values_inverted = sqrt( p.values ) if both conditions have the same dominant transcript;
adj.p.values_inverted (only available for 2-group comparisons) is the Benjamini-Hochberg adjusted p.values_inverted, via p.adjust;
DTU_measure (only available for 2-group comparisons) represents a measure of the intensity of changes between conditions. This measure ranges between 0, when proportions are identical between groups, and 2, when an isoform is always expressed in group A and a different transcript is always chosen in group B;
Mean log-prec "group_name" indicates the posterior mean of the logarithm of the Dirichlet precision parameter in group "group_name". The precision parameter models the degree of over-dispersion between samples: the higher the precision parameter (or its logarithm), the lower the sample-to-sample variability.
SD log-prec "group_name" indicates the standard deviation of the logarithm of the Dirichlet precision parameter in group "group_name".
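
The rule for p.values_inverted and its adjusted version can be illustrated with a short base-R sketch. The p-values and the dominant_changes flags below are made-up inputs; in BANDITS these quantities are derived from the fitted model, and this snippet only reproduces the arithmetic described above.

```r
# Hypothetical gene-level p-values and a flag stating whether the dominant
# transcript differs between the two conditions (illustrative values only).
p_values         <- c(geneA = 0.0004, geneB = 0.0004, geneC = 0.04)
dominant_changes <- c(geneA = TRUE,   geneB = FALSE,  geneC = FALSE)

# Keep the p-value when the dominant transcript changes between conditions;
# otherwise take its square root, which makes it larger (more conservative).
p_values_inverted <- ifelse(dominant_changes, p_values, sqrt(p_values))

# Benjamini-Hochberg adjustment via p.adjust.
adj_p_values_inverted <- p.adjust(p_values_inverted, method = "BH")

print(data.frame(p_values, dominant_changes,
                 p_values_inverted, adj_p_values_inverted))
```

Here geneA and geneB start from the same p-value, but only geneA keeps it, because its dominant transcript changes between conditions.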

Transcript_results

a data.frame containing the transcript-level results of the DTU test, structured in the following columns:

Gene_id contains the gene names;
Transcript_id contains the transcript names;
p.values contains the transcript-level p.values of the DTU test;
adj.p.values contains the Benjamini-Hochberg adjusted p.values (via p.adjust);
Max_Gene_Tr.p.val is a conservative p.value and represents the maximum between the transcript p.value and corresponding gene p.value;
Max_Gene_Tr.Adj.p.val is the Benjamini-Hochberg adjusted Max_Gene_Tr.p.val (via p.adjust);
Mean "group_name" indicates the posterior mean of the average relative abundance of the transcript in group "group_name" (an NaN value indicates that no data was available for a group to estimate parameters);
SD "group_name" indicates the standard deviation of the average relative abundance of the transcript in group "group_name" (an NaN value indicates that no data was available for a group to estimate parameters); this column indicates the variability in the mean estimate and is used to plot a Wald type confidence interval for the mean relative abundance via plot_proportions.
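
As an illustration of the conservative Max_Gene_Tr.p.val column, the sketch below combines made-up transcript-level and gene-level p-values in base R. The data are hypothetical and the snippet only mirrors the definition given above (maximum of the two p-values, followed by a Benjamini-Hochberg adjustment); it is not the package's internal code.

```r
# Hypothetical gene-level p-values, indexed by gene name.
gene_p <- c(geneA = 0.001, geneB = 0.2)

# Hypothetical transcript-level results.
tr <- data.frame(
  Gene_id       = c("geneA", "geneA", "geneB"),
  Transcript_id = c("tr1", "tr2", "tr3"),
  p.values      = c(0.0005, 0.03, 0.01),
  stringsAsFactors = FALSE
)

# Conservative p-value: the larger of the transcript p-value and the p-value
# of the gene the transcript belongs to.
tr$Max_Gene_Tr.p.val <- pmax(tr$p.values, unname(gene_p[tr$Gene_id]))

# Benjamini-Hochberg adjustment via p.adjust.
tr$Max_Gene_Tr.Adj.p.val <- p.adjust(tr$Max_Gene_Tr.p.val, method = "BH")

print(tr)
```

With this rule, a transcript is reported as significant only if its gene is significant as well: tr3 has a small transcript-level p-value, but inherits the larger p-value of geneB.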

Convergence

a data.frame containing the convergence diagnostics of the DTU test, structured in the following columns:

Gene_id contains the gene names;
converged is 1 if convergence was reached, 0 otherwise;
burn_in indicates what fraction of the chain was removed to ensure convergence (excluding the burn_in parameter specified in test_DTU).

samples_design

a data.frame containing the design of the experiment, with one row for each sample and two columns with names 'sample_id' and 'group', specifying the id and group of each sample, respectively. It is provided by the user to test_DTU.
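
A design table of this shape can be built with base R as sketched below. The sample and group labels are placeholders chosen for illustration; only the column names 'sample_id' and 'group' come from the description above.

```r
# Hypothetical two-group design with two samples per group.
samples_design <- data.frame(
  sample_id = paste0("sample", 1:4),
  group     = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# One row per sample; check how many samples fall in each group.
print(samples_design)
print(table(samples_design$group))
```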

Simone Tiberi simone.tiberi@uzh.ch

test_DTU, create_data, BANDITS_data

# load the pre-computed results:
data("results", package = "BANDITS")
show(results)

# Visualize the most significant Genes, sorted by gene level significance.
head(top_genes(results))

# Alternatively, gene-level results can also be sorted according to DTU_measure, 
# which is a measure of the strength of the change between the 
# average relative abundances of the two groups.
head(top_genes(results, sort_by_g = "DTU_measure"))

# Visualize the most significant transcripts, sorted by transcript level significance.
head(top_transcripts(results, sort_by_tr = "transcript"))

# Visualize the convergence output for the most significant genes, 
# sorted by gene level significance.
head(convergence(results))

# We can further use the 'gene' function to gather all output for a specific gene:
# gene level, transcript level and convergence results.
top_gene = top_genes(results, n = 1)
gene(results, top_gene$Gene_id)

# Similarly, we can use the 'transcript' function to gather all output 
# for a specific transcript.
top_transcript = top_transcripts(results, n = 1)
transcript(results, top_transcript$Transcript_id)

# Finally, we can plot the estimated average transcript relative expression 
# in the two groups for a specific gene via 'plot_proportions'.
plot_proportions(results, top_gene$Gene_id)