knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) suppressPackageStartupMessages({ library(GenomicRanges) library(ggplot2) library(magrittr) library(VplotR) })
VplotR is an R package streamlining the process of generating V-plots, i.e. two-dimensional paired-end fragment density plots. It contains functions to import paired-end sequencing bam files from any type of DNA accessibility experiments (e.g. ATAC-seq, DNA-seq, MNase-seq) and can produce V-plots and one-dimensional footprint profiles over single or aggregated genomic loci of interest. The R package is well integrated within the Bioconductor environment and easily fits in standard genomic analysis workflows. Integrating V-plots into existing analytical frameworks has already brought additional insights in chromatin organization (Serizay et al., 2020).
The main user-level functions of VplotR are getFragmentsDistribution()
,
plotVmat()
, plotFootprint()
and plotProfile()
.
getFragmentsDistribution()
computes the distribution of fragment sizes
over sets of genomic ranges;plotVmat()
is used to compute fragment density and generate V-plots;plotFootprint()
generates the MNase-seq or ATAC-seq footprint at a
set of genomic ranges.plotProfile()
is used to plot the distribution of paired-end fragments
at a single locus of interest.VplotR can be installed from Bioconductor:
if(!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("VplotR") library("VplotR")
importPEBamFiles()
functionPaired-end .bam files are imported using the importPEBamFiles()
function as follows:
library(VplotR) bamfile <- system.file("extdata", "ex1.bam", package = "Rsamtools") fragments <- importPEBamFiles( bamfile, shift_ATAC_fragments = TRUE ) fragments
Several datasets are available for this package:
data(ce11_proms) ce11_proms data(ATAC_ce11_Serizay2020) ATAC_ce11_Serizay2020
data(ABF1_sacCer3) ABF1_sacCer3 data(MNase_sacCer3_Henikoff2011) MNase_sacCer3_Henikoff2011
A preliminary control to check the distribution of fragment
sizes (regardless of their location relative to genomic loci) can be
performed using the getFragmentsDistribution()
function.
df <- getFragmentsDistribution( MNase_sacCer3_Henikoff2011, ABF1_sacCer3 ) p <- ggplot(df, aes(x = x, y = y)) + geom_line() + theme_ggplot2() p
Once data is imported, a V-plot of paired-end fragments over loci of
interest is generated using the plotVmat()
function:
p <- plotVmat(x = MNase_sacCer3_Henikoff2011, granges = ABF1_sacCer3) p
The generation of multiple V-plots can be parallelized using a list of parameters:
list_params <- list( "MNase\n@ ABF1" = list(MNase_sacCer3_Henikoff2011, ABF1_sacCer3), "MNase\n@ random loci" = list( MNase_sacCer3_Henikoff2011, sampleGRanges(ABF1_sacCer3) ) ) p <- plotVmat( list_params, cores = 1 ) p
For instance, ATAC-seq fragment density can be visualized at different classes of ubiquitous and tissue-specific promoters in C. elegans.
list_params <- list( "Germline ATACseq\n@ Ubiq. proms" = list( ATAC_ce11_Serizay2020[['Germline']], ce11_proms[ce11_proms$which.tissues == 'Ubiq.'] ), "Germline ATACseq\n@ Germline proms" = list( ATAC_ce11_Serizay2020[['Germline']], ce11_proms[ce11_proms$which.tissues == 'Germline'] ), "Neuron ATACseq\n@ Ubiq. proms" = list( ATAC_ce11_Serizay2020[['Neurons']], ce11_proms[ce11_proms$which.tissues == 'Ubiq.'] ), "Neuron ATACseq\n@ Neuron proms" = list( ATAC_ce11_Serizay2020[['Neurons']], ce11_proms[ce11_proms$which.tissues == 'Neurons'] ) ) p <- plotVmat( list_params, cores = 1, nrow = 2, ncol = 5 ) p
Different normalization approaches are available using the normFun
argument.
normFun = 'none'
. # No normalization p <- plotVmat( list_params, cores = 1, nrow = 2, ncol = 5, verbose = FALSE, normFun = 'none' ) p
# Library depth + number of loci of interest (default) p <- plotVmat( list_params, cores = 1, nrow = 2, ncol = 5, verbose = FALSE, normFun = 'libdepth+nloci' ) p
# Zscore p <- plotVmat( list_params, cores = 1, nrow = 2, ncol = 5, verbose = FALSE, normFun = 'zscore' ) p # Quantile p <- plotVmat( list_params, cores = 1, nrow = 2, ncol = 5, verbose = FALSE, normFun = 'quantile', s = 0.99 ) p
VplotR also implements a function to profile the footprint from MNase or ATAC-seq over sets of genomic loci. For instance, CTCF is known for its ~40-bp large footprint at its binding loci.
p <- plotFootprint( MNase_sacCer3_Henikoff2011, ABF1_sacCer3 ) p
VplotR provides a function to plot the distribution of paired-end fragments over an individual genomic window.
data(MNase_sacCer3_Henikoff2011_subset) genes_sacCer3 <- GenomicFeatures::genes(TxDb.Scerevisiae.UCSC.sacCer3.sgdGene:: TxDb.Scerevisiae.UCSC.sacCer3.sgdGene ) p <- plotProfile( fragments = MNase_sacCer3_Henikoff2011_subset, window = "chrXV:186,400-187,400", loci = ABF1_sacCer3, annots = genes_sacCer3, min = 20, max = 200, alpha = 0.1, size = 1.5 ) p
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.