knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
ragg_png = function(..., res = 192) {
  ragg::agg_png(..., res = res, units = "in")
}
knitr::opts_chunk$set(dev = "ragg_png", fig.ext = "png")


This vignette describes how to create a Shiny server that displays interactive Sashimi plots from RNA-seq data.

Splicejam uses memoise to cache data, so it is recommended to start the Splicejam Shiny app in its own directory. Each type of data is cached in its own sub-directory with "_memoise" in the name, so you can recognize and delete these directories to clear the cache as needed.
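
Clearing the cache can be scripted. The sketch below is a minimal base R example, assuming the cache sub-directories sit directly inside the directory where the app was started and contain "_memoise" in their names as described above. The helper name clear_memoise_cache() and its arguments are hypothetical and are not part of Splicejam.

```r
# Hypothetical helper, not part of Splicejam: find (and optionally delete)
# sub-directories whose names contain "_memoise" under an app directory.
clear_memoise_cache <- function(app_dir = ".", dry_run = TRUE) {
   # list only the immediate sub-directories of app_dir
   sub_dirs <- list.dirs(app_dir, full.names = TRUE, recursive = FALSE)
   cache_dirs <- sub_dirs[grepl("_memoise", basename(sub_dirs), fixed = TRUE)]
   # delete only when explicitly requested
   if (!dry_run && length(cache_dirs) > 0) {
      unlink(cache_dirs, recursive = TRUE)
   }
   # return the matching directories invisibly for inspection
   invisible(cache_dirs)
}

# preview which directories would be deleted, without deleting anything
print(clear_memoise_cache(".", dry_run = TRUE))
```

Calling the helper with dry_run = FALSE removes the matching directories, so it may be worth checking the preview first.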

Data Requirements

There are two basic requirements for a Sashimi plot:

  1. Gene-exon structure, usually provided by a GTF file.
  2. Source data, sequence coverage and junction read counts:

     a. RNA-seq coverage data in bigWig files.

     b. Splice junction data in BED files or STAR "" files.



# define files with coverage and junctions
filesDF <- data.frame(sample_id=c("sample_A", "sample_A"),
   url=c("https://server/", "https://server/sample_A/"),
   type=c("bw", "junction"));

# provide path to genes GTF
gtf <- "path/to/genes.gtf"

# launch Splicejam Shiny app
launchSashimiApp(filesDF=filesDF, gtf=gtf)

Gene-exon structure using GTF file

Ideally, the GTF used by splicejam will be the same GTF file used in upstream processing, for example with Salmon, Kallisto, or featureCounts. The benefit is that the GTF will display gene-exon structure consistent with your overall analysis work.

That said, any GTF file for your genome will work fine. The GTF is used to determine gene-exon structure, and to display transcript isoforms per gene. It then displays RNA-seq coverage and junction reads over these exons.

Splicejam will derive several objects from this GTF file:

Most of the workflow was designed for the Gencode GTF format, which includes "gene_name" with gene symbols, and "transcript_id" with transcript identifiers.

Other GTF files should work even if they have custom gene and transcript attributes. When in doubt, use a Gencode GTF file.
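
When in doubt about a GTF file, its attribute column can be inspected before use. The following is a minimal sketch in base R, assuming the standard GTF layout where column 9 holds attribute pairs; the helper gtf_has_attributes() is hypothetical, is not part of Splicejam, and the toy line uses made-up values.

```r
# Hypothetical helper, not part of Splicejam: report whether attribute keys
# appear in column 9 of the supplied GTF lines.
gtf_has_attributes <- function(gtf_lines,
   keys = c("gene_name", "transcript_id")) {
   # drop header and comment lines that start with "#"
   gtf_lines <- gtf_lines[!grepl("^#", gtf_lines)]
   # column 9 holds the 'key "value";' attribute pairs
   attrs <- vapply(strsplit(gtf_lines, "\t", fixed = TRUE),
      function(x) x[9], character(1))
   # TRUE when a key is followed by a quoted value in any line
   vapply(keys, function(k) {
      any(grepl(paste0("(^|;\\s*)", k, "\\s+\""), attrs))
   }, logical(1))
}

# toy GTF line in Gencode style (values are made up for illustration)
toy_gtf <- paste(c("chr1", "HAVANA", "exon", "100", "200", ".", "+", ".",
   "gene_id \"G1\"; transcript_id \"T1\"; gene_name \"Abc1\";"),
   collapse = "\t")
gtf_has_attributes(toy_gtf)
```

For a real file, the first lines could be read with readLines(gtf, n=1000) and passed to the same helper.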

Commonly used GTF files by organism

Some GTF files that have been tested and confirmed with Splicejam are listed below.

## Mouse mm10 as used by Farris et al

## Mouse mm10

## Human hg38

## Human hg19

Example using Gencode for hg19

Simple enough, just assign the file path to gtf:

gtf <- "";

Source data

Source data, with sequence coverage and junction read counts, is supplied as a data.frame with these columns:

When there are multiple replicate files for a sample_id, the scores are added together and the total score is displayed in the sashimi plot for each sample_id. When scale_factor is also defined, it is applied to each file before values are summed across sample_id replicates. In this way, files can be individually normalized as needed.
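
The order of operations described above (scale each file, then sum within sample_id) can be illustrated with a small base R sketch. This is not Splicejam's internal code; the data.frame toy_scores, its file and score columns, and all values are made up for illustration.

```r
# toy per-file scores; all values are made up for illustration
toy_scores <- data.frame(
   sample_id = c("sample_A", "sample_A", "sample_B"),
   file = c("A_rep1", "A_rep2", "B_rep1"),
   scale_factor = c(1.0, 0.5, 2.0),
   score = c(10, 20, 7))

# apply scale_factor to each file first
toy_scores$scaled <- toy_scores$score * toy_scores$scale_factor

# then sum the scaled values across replicates within each sample_id
sample_totals <- tapply(toy_scores$scaled, toy_scores$sample_id, sum)
sample_totals
```

Here sample_A totals 20 (10 * 1.0 + 20 * 0.5) and sample_B totals 14 (7 * 2.0).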

Example filesDF data.frame

An example of filesDF is provided in the R package "farrisdata".

if (jamba::check_pkg_installed("farrisdata")) {
   head(farrisdata::farris_sashimi_files_df)
}

RNA-seq coverage data

RNA-seq coverage is provided as bigWig files, and can be accessed via HTTP web hyperlink, or a direct file path.

Splice junction data

Splice junctions are provided in one of two formats:

  1. BED format: We use BED12 format, but plain BED format also works. In BED12 format, each junction is encoded as two 1-base alignments with a gap between them that represents the splice junction itself. When all BED names are numeric, the BED name is used as the junction score, otherwise the BED score is used. The reason is that bigBed format does not allow scores higher than 1000, so we encode scores in the name field.
  2. STAR "" format: a tab-delimited file produced by the STAR alignment tool. This file must have 9 columns, and column 7 is used to define read counts because it contains uniquely mapped reads. Junctions with zero uniquely mapped reads are removed.
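
The two scoring rules above can be sketched in base R. This is an illustration of the described behavior, not Splicejam's own import code; the toy junction rows, the variable names, and the helper bed_junction_score() are hypothetical.

```r
# toy STAR junction table with 9 tab-delimited columns (made-up values)
toy_star <- c(
   "chr1\t1001\t2000\t1\t1\t1\t25\t3\t40",
   "chr1\t3001\t4000\t1\t1\t0\t0\t5\t12")
star_df <- read.table(text = toy_star, sep = "\t", header = FALSE,
   stringsAsFactors = FALSE)
stopifnot(ncol(star_df) == 9)

# column 7 holds uniquely mapped reads; drop junctions with zero
star_df <- star_df[star_df[[7]] > 0, , drop = FALSE]
star_scores <- star_df[[7]]

# BED rule: use the name as the score only when every name is numeric
bed_junction_score <- function(bed_name, bed_score) {
   name_num <- suppressWarnings(as.numeric(bed_name))
   if (length(bed_name) > 0 && !anyNA(name_num)) {
      name_num
   } else {
      as.numeric(bed_score)
   }
}

# all names numeric: names are used as scores
bed_junction_score(c("1500", "2300"), c(1000, 1000))
# one non-numeric name: the BED score column is used instead
bed_junction_score(c("juncA", "2300"), c(12, 1000))
```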

Quick start using Farris data

As a positive control for the Splicejam Shiny server, you can use the Farris data that supports Farris et al (2019).

if (!jamba::check_pkg_installed("farrisdata")) {
   stop("Install the 'farrisdata' package before running this workflow.")
}


Note that this workflow uses filesDF from the "farrisdata" package and downloads the Gencode mouse GTF used for that publication. It takes about 3 minutes to prepare data and create the first Shiny plot for the gene Gria1.

This workflow will by default populate the R environment globalenv() with variables used in the farrisdata Splicejam Shiny app. See Advanced Options for ways to specify a new environment.

Quick start with GTF and filesDF


# filesDF should already be defined
filesDF2 <- subset(farrisdata::farris_sashimi_files_df,
   sample_id %in% c("CA1_DE", "CA2_CB"));

# gtf should be a file path or web URL
gtf <- "";

launchSashimiApp(filesDF=filesDF2, gtf=gtf)

This workflow will by default populate the R environment globalenv() with variables used in the farrisdata Splicejam Shiny app. See Advanced Options for ways to specify a new environment.

Useful custom options

The following options can be invoked by defining the variable name inside the environment used for the Splicejam Shiny app. By default, this environment is globalenv(); however, the examples below show how to use a custom environment. The advantage of using an environment is that the data contained inside is not copied in memory during function calls, and can be shared by the Shiny UI and Shiny Server.

For example, to define detectedTx you would simply assign a value in the globalenv():

detectedTx <- rownames(tx2geneDF);

Or if using a custom environment:

splicejam_env <- new.env();
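
The custom-environment case can be sketched with base R alone. The example below assumes a toy stand-in for tx2geneDF with made-up values, and the helper add_flag() is hypothetical; it only illustrates assigning a variable into an environment and the fact that environments are modified in place rather than copied.

```r
# create an environment to hold Splicejam variables
splicejam_env <- new.env()

# toy stand-in for tx2geneDF; real values would come from the GTF file
tx2geneDF <- data.frame(
   gene_name = c("Gria1", "Gria1"),
   row.names = c("tx1", "tx2"))

# define detectedTx inside the custom environment
assign("detectedTx", rownames(tx2geneDF), envir = splicejam_env)

# retrieve the variable by name from the environment
get("detectedTx", envir = splicejam_env)

# environments are passed by reference: a function can change the
# contents in place, and the caller sees the change without a copy
add_flag <- function(env) {
   env$prepared <- TRUE
   invisible(NULL)
}
add_flag(splicejam_env)
ls(splicejam_env)
```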

Advanced options

Start the Shiny app on a different port

One of the most common ways to set up a Shiny server is to run it on a custom port and have it listen on a specific address.

Note in the example below, host="" will instruct the Shiny app to respond to requests directed at any host or IP address. If you used host="" the server would only respond to requests specific to and would not respond to requests to https://localhost:8080.
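
As one hedged way to configure the port and host, Shiny consults the global options shiny.port and shiny.host when an app starts, provided the app does not override them. The values below are illustrative assumptions: port 8080 matches the URL mentioned above, and "0.0.0.0" is the address Shiny documents for accepting connections on all IPv4 interfaces. Whether launchSashimiApp() also accepts these settings as arguments is not shown in this section, so its help page is the place to check.

```r
# illustrative values, set as global Shiny options for this R session
options(shiny.port = 8080L, shiny.host = "0.0.0.0")

# confirm the settings that a Shiny app would pick up by default
getOption("shiny.port")
getOption("shiny.host")

# then launch the app as usual, for example:
# launchSashimiApp(filesDF=filesDF2, gtf=gtf)
```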


Specify a specific R environment for Splicejam data

By default the data preparation uses the global environment defined by globalenv(). This process will create objects in your user session, and will update those objects during the preparation step.

However, you can create a custom environment to keep the data encapsulated, and separate from your user session.

Using a custom environment is intended to be straightforward. First, define a new environment.


splicejam_env <- new.env();

# filesDF should already be defined
filesDF2 <- subset(farrisdata::farris_sashimi_files_df,
   sample_id %in% c("CA1_DE", "CA2_CB"));

# gtf should be a file path or web URL
gtf <- "";


Specify gene-exon data without using a GTF file

At its core, splicejam derives the data it needs from the GTF data, but it can be supplied with this data directly, avoiding the need for a GTF file.

More detail will be added here in the future, but for now see the vignette "Create a Sashimi Plot".

jmw86069/splicejam documentation built on Dec. 19, 2024, 5:25 p.m.