options(tinytex.verbose = TRUE) knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
VaSP is an R package for discovery of genome-wide variable splicing events from short-read RNA-seq data. Based on R package Ballgown, VaSP calculates Single Splicing Strength (3S) score of any intron by the junction count normalized by the gene-level average read coverage (score=junction count/gene-level average read coverage). The 3S scores can be used for further analysis, such as differential splicing analysis between two sample groups and sQTL (splicing Quantitative Trait Locus) identification in a large population. The VaSP package provides a function to find large-effect differential splicing events without the need of genotypic information in an inbred plant population, so called genotype-specific splicing (GSS). Integrated with functions from R package Sushi, VaSP package also provides function to visualization of gene splicing information for publication-quality multi-panel figures.
Start R and run:
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("vasp")
Users need to follow the manual of R package Ballgown (https://github.com/alyssafrazee/ballgown) to creat a ballgown object as an input for the VaSP package. See ?ballgown
for detailed information on creating Ballgown objects. The object can be stored in a .RDate
file by save()
.
library(ballgown) ?ballgown path<-system.file('extdata', package='vasp') rice.bg<-ballgown(samples = list.dirs(path = path,recursive = F) )
Calculate 3S (Single Splicing Strength) scores, find GSS (genotype-specific splicing) events and display the splicing information.
library(vasp) data(rice.bg) ?rice.bg rice.bg score<-spliceGene(rice.bg, gene="MSTRG.183", junc.type = "score") tail(round(score,2),2)
gss <- BMfinder(score, cores = 1) gss
gss_intron<-structure(rice.bg)$intron (gss_intron<-gss_intron[gss_intron$id%in%rownames(gss)]) range(gss_intron)
splicePlot(rice.bg,gene='MSTRG.183',samples = sampleNames(rice.bg)[c(1,3,5)], start = 1179000, end = 1179300)
Currently, there are 6 functions in VaSP:
getDepth: Get read depth from a BAM file (in bedgraph format)
getGeneinfo: Get gene informaton from a ballgown object
spliceGene: Calculate 3S scores for one gene
spliceGenome: Calculate genome-wide splicing scores
BMfinder: Discover bimodal distrubition features
splicePlot: Visualization of read coverage, splicing information and gene information in a gene region
Get read depth from a BAM file (in bedgraph format) and return a data.frame in bedgraph file format which can be used as input for plotBedgraph
in the SuShi package.
path <- system.file("extdata", package = "vasp") bam_files <- list.files(path, "*.bam$") bam_files depth <- getDepth(file.path(path, bam_files[1]), "Chr1", start = 1171800, end = 1179400) head(depth) library(Sushi) par(mar=c(3,5,1,1)) plotBedgraph(depth, "Chr1", chromstart = 1171800, chromend = 1179400, yaxt = "s") mtext("Depth", side = 2, line = 2.5, cex = 1.2, font = 2) labelgenome("Chr1", 1171800, 1179400, side = 1, scipen = 20, n = 5, scale = "Kb")
Get gene informaton from a ballgown object by genes or by genomic regions and return a data.frame in bed-like file format that can be used as input for plotGenes
in the SuShi package
data(rice.bg) unique(geneIDs(rice.bg)) gene_id <- c("MSTRG.181", "MSTRG.182", "MSTRG.183") geneinfo <- getGeneinfo(genes = gene_id, rice.bg) trans <- table(geneinfo$name) # show how many exons each transcript has trans library(Sushi) chrom = geneinfo$chrom[1] chromstart = min(geneinfo$start) - 1e3 chromend = max(geneinfo$stop) + 1e3 color = rep(SushiColors(2)(length(trans)), trans) par(mar=c(3,1,1,1)) p<-plotGenes(geneinfo, chrom, chromstart, chromend, col = color, bheight = 0.2, bentline = FALSE, plotgenetype = "arrow", labeloffset = 0.5) labelgenome(chrom, chromstart , chromend, side = 1, n = 5, scale = "Kb")
Calculate 3S Scores from ballgown object for a given gene. This function can only calculate one gene. Please use function spliceGenome
to obtain genome-wide 3S scores.
data(rice.bg) rice.bg head(geneIDs(rice.bg)) score <- spliceGene(rice.bg, "MSTRG.183", junc.type = "score") count <- spliceGene(rice.bg, "MSTRG.183", junc.type = "count") ## compare tail(score) tail(count) ## get intron structrue intron <- structure(rice.bg)$intron intron[intron$id %in% rownames(score)]
Calculate 3S scores from ballgown objects for all genes and return a list of two elelments: "score' is matrix of intron 3S scores with intron rows and sample columns and "intron" is a GRanges
object of intron structure.
data(rice.bg) rice.bg splice <- spliceGenome(rice.bg, gene.select = NA, intron.select = NA) names(splice) head(splice$score) splice$intron
Find bimodal distrubition features and divide the samples into 2 groups by k-means clustering and return a matrix with feature rows and sample columns.
data(rice.bg) score <- spliceGene(rice.bg, "MSTRG.183", junc.type = "score") score <- round(score, 2) as <- BMfinder(score, cores = 1) # 4 bimodal distrubition features found ## compare as score[rownames(score) %in% rownames(as), ]
Visualization of read coverage, splicing information and gene information in a gene region. This function is a wrapper of getDepth
, getGeneinfo
, spliceGene
, plotBedgraph
and plotGenes
.
data(rice.bg) rice.bg samples <- paste("Sample", c("027", "102", "237"), sep = "_") bam.dir <- system.file("extdata", package = "vasp") ## plot the whole gene region without junction lables splicePlot(rice.bg, samples, bam.dir, gene = "MSTRG.183", junc.text = FALSE, bheight = 0.2) ## plot the alternative splicing region with junction splicing scores splicePlot(rice.bg, samples, bam.dir, gene = "MSTRG.183", start = 1179000)
If the bam files are provided (bam.dir
is not NA), the read depth for each sample is plotted. Otherwise (bam.dir=NA
), the coserved exons of the samples are displayed by rectangles (an example is the figure in 4. Quick start). And by default (junc.type = 'score'
, junc.text = TRUE
), the junctions (represented by arcs) are labeled with splicing scores. You can change the argument junc.text = FALSE
to unlabel the junctions or change the argument junc.type = 'count'
to label with junction read counts.
splicePlot(rice.bg, samples, bam.dir, gene = "MSTRG.183", junc.type = 'count', start = 1179000)
There are other more options to modify the plot, please see the function ?splicePlot
for details.
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.