NucleR is an R/Bioconductor package for working with next-generation sequencing and tiling array data. It uses a novel approach in this field, combining deep profile cleaning based on the Fourier Transform with peak scoring, for quick and flexible nucleosome calling.
The aim of this package is not to provide an all-in-one data analysis pipeline but to complement the existing specialized libraries for low-level data import and pre-processing within the R/Bioconductor framework.
NucleR works with data from the high-throughput technologies MNase-seq and ChIP-seq, and from Tiling Microarrays (ChIP-on-Chip).
This is a brief summary of the main functions:
readBAM, processReads, processTilingArray
coverage.rpm, filterFFT, controlCorrection
peakDetection, peakScoring
plotPeaks
syntheticNucMap
This software was published in the journal Bioinformatics: Flores, O., and Orozco, M. (2011). nucleR: a package for non-parametric nucleosome positioning. Bioinformatics 27, 2149–2150.
Follow these instructions to install 'nucleR' from the Bioconductor repository on a Linux system:
# e.g., for Ubuntu distributions:
apt-get install libcurl4-gnutls-dev
# e.g., for openSUSE distributions:
yast2 -i libcurl-devel
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install('dplyr')
BiocManager::install('IRanges')
BiocManager::install('GenomicRanges')
BiocManager::install('ShortRead',ask = FALSE)
BiocManager::install('doParallel')
BiocManager::install('ggplot2')
BiocManager::install('magrittr')
BiocManager::install('nucleR')
Alternatively, build the R package from the source code deposited in this repository:
git clone https://github.com/nucleosome-dynamics/nucleR.git
tar -czvf nucleR.tar.gz nucleR
install.packages("nucleR.tar.gz", repos = NULL)
This is an example of the main steps followed to analyse nucleosome positioning data with nucleR. For more details about the functions, a description of the example data, how to load your own data, and additional analyses, refer to the nucleR manual and vignette.
1- Load the package in R
library(nucleR)
2- Load the example data provided with the package, containing the positions of MNase-seq reads mapped to the S. cerevisiae genome
data(nucleosome_htseq)
class(nucleosome_htseq)
nucleosome_htseq
3- Filter reads and remove noise: discard reads longer than 200bp (a threshold chosen to keep only mononucleosomes), and reduce the noise caused by variable MNase digestion efficiency by trimming each read to its central part (50bp around the dyad)
reads_trim <- processReads(nucleosome_htseq, type="paired", fragmentLen=200, trim=50)
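To make the filtering and trimming in step 3 more concrete, here is a minimal base-R sketch of the idea. It is not the code nucleR runs: the helper `filter_and_trim`, its arguments and the toy coordinates are hypothetical, and the dyad is simply assumed to sit at the fragment midpoint.

```r
# Hypothetical paired-end fragments as 1-based, inclusive coordinates.
frag_start <- c(100, 300, 500)
frag_end   <- c(246, 560, 639)   # widths: 147, 261 and 140 bp

# Illustrative helper (not part of nucleR): drop fragments longer than
# `fragment_len`, then keep only `trim` bp centred on the fragment midpoint.
filter_and_trim <- function(start, end, fragment_len = 200, trim = 50) {
  width <- end - start + 1
  keep  <- width <= fragment_len            # discard likely di-nucleosomes
  start <- start[keep]
  end   <- end[keep]
  dyad  <- floor((start + end) / 2)         # assumed dyad position
  new_start <- dyad - floor((trim - 1) / 2)
  data.frame(start = new_start, end = new_start + trim - 1)
}

trimmed <- filter_and_trim(frag_start, frag_end)
print(trimmed)
```

In this toy input the 261bp fragment is discarded and the two remaining fragments are each reduced to 50bp, which is the effect that `fragmentLen=200` and `trim=50` are described as having above.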
4- Obtain the normalized coverage (the number of reads mapped to each position, divided by the total number of reads and multiplied by one million)
cover_trim <- coverage.rpm(reads_trim)
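The normalization described in step 4 can be written out in a few lines of base R. This is only a sketch of the arithmetic (reads per position, divided by the total number of reads, times one million); the function `rpm_coverage` and the toy reads are assumptions made for illustration and say nothing about how `coverage.rpm` is implemented internally.

```r
# Illustrative helper (not part of nucleR): per-base coverage in reads per
# million for reads given as 1-based, inclusive start/end coordinates.
# Assumes every read lies within 1..chr_len.
rpm_coverage <- function(start, end, chr_len) {
  # Difference array: +1 where a read starts, -1 just past where it ends.
  delta <- numeric(chr_len + 1)
  for (i in seq_along(start)) {
    delta[start[i]]   <- delta[start[i]] + 1
    delta[end[i] + 1] <- delta[end[i] + 1] - 1
  }
  raw <- cumsum(delta)[seq_len(chr_len)]   # raw read count per position
  raw / length(start) * 1e6                # scale to reads per million
}

# Three toy reads on an 8 bp "chromosome".
cov_rpm <- rpm_coverage(start = c(1, 3, 3), end = c(4, 6, 5), chr_len = 8)
print(cov_rpm)
```

Because the values are scaled by the total read count, coverage profiles from experiments with different sequencing depths become comparable.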
5- Smooth the coverage signal using the Fast Fourier Transform
fft_ta <- filterFFT(cover_trim, pcKeepComp=0.01, showPowerSpec=TRUE)
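As a rough illustration of what FFT-based smoothing does in step 5, the base-R sketch below keeps only the lowest-frequency components of a noisy signal and transforms it back. The function `fft_smooth`, its `pc_keep` argument and the synthetic signal are hypothetical; how `filterFFT` actually selects components from `pcKeepComp` may differ in detail.

```r
# Illustrative low-pass filter (not nucleR's implementation): keep roughly a
# fraction `pc_keep` of the Fourier components, all at the lowest frequencies.
# Assumes length(x) is much larger than the number of components kept.
fft_smooth <- function(x, pc_keep = 0.01) {
  n    <- length(x)
  spec <- fft(x)
  k    <- max(1, round(n * pc_keep / 2))   # components kept on each side
  keep <- rep(FALSE, n)
  keep[1:(k + 1)]     <- TRUE              # DC term plus k positive frequencies
  keep[(n - k + 1):n] <- TRUE              # the matching negative frequencies
  spec[!keep] <- 0
  Re(fft(spec, inverse = TRUE)) / n        # back to the position domain
}

# Synthetic example: a slow periodic signal plus Gaussian noise.
set.seed(1)
pos    <- 1:2000
clean  <- 10 + 5 * sin(2 * pi * pos / 400)
noisy  <- clean + rnorm(length(pos), sd = 2)
smooth <- fft_smooth(noisy, pc_keep = 0.01)
```

Keeping fewer components gives a smoother profile but can merge neighbouring peaks, so the fraction kept is a trade-off between noise removal and resolution.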
6- Detect peaks in the smoothed coverage which correspond to nucleosome dyads and score them according to their fuzziness level
peaks <- peakDetection(fft_ta, threshold="25%", score=TRUE, width=147)
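The detection part of step 6 can be pictured as finding local maxima of the smoothed profile that rise above a threshold. The base-R sketch below assumes that a percentage threshold such as "25%" corresponds to a quantile of the coverage values; the function `find_peaks` and the toy signal are hypothetical, and the scoring and 147bp range expansion done by `peakDetection` are not reproduced here.

```r
# Illustrative peak finder (not nucleR's implementation): positions that are
# local maxima and lie above the `threshold_pc` quantile of the signal.
find_peaks <- function(x, threshold_pc = 0.25) {
  cutoff <- quantile(x, probs = threshold_pc, names = FALSE)
  inner  <- 2:(length(x) - 1)              # the two ends cannot be local maxima
  is_max <- x[inner] > x[inner - 1] & x[inner] >= x[inner + 1]
  inner[is_max & x[inner] > cutoff]
}

# Toy smoothed coverage: two clear peaks and one small bump in a low region.
signal <- c(2, 4, 8, 4, 2, 0, 0.5, 0, 2, 5, 9, 5, 2, 2, 2, 2)
peak_pos <- find_peaks(signal)
print(peak_pos)
```

Here the small bump at position 7 is a local maximum but falls below the quantile cutoff, so only the two clear peaks are reported.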
This repository builds on the original nucleR package, written by Oscar Flores.
TODO
Add other old functions
Add tests
Test on Windows