SINBAD: A pipeline for processing SINgle cell Bisulfite sequencing samples and Analysis of Data

SINBAD is an R package for processing single cell DNA methylation data. It accepts FASTQ files as input and performs demultiplexing, adapter trimming, mapping, quantification, dimensionality reduction, and differential methylation analysis.

NOTE: SINBAD is tested on paired snmC-Seq data.

System requirements

R 3.6.0 or later is required for installation.
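
As a quick sanity check before installing, the version requirement above can be tested from within R. This is a minimal sketch that uses only base R; the variable names are illustrative and are not part of SINBAD.

```r
# Minimum R version stated in the System requirements section.
required_version <- "3.6.0"

# getRversion() returns the running R version as a comparable object.
r_is_recent_enough <- getRversion() >= required_version

if (r_is_recent_enough) {
  message("R ", as.character(getRversion()), " meets the requirement.")
} else {
  message("R ", as.character(getRversion()), " is older than ",
          required_version, "; please upgrade before installing.")
}
```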

Installation

To install SINBAD, run the following command at the R prompt:

devtools::install_github("yasin-uzun/SINBAD")

Once you have installed SINBAD, verify that it is installed correctly as follows:

SINBAD::test()

If SINBAD is installed without any problems, you should see the following message:


>SINBAD installation is ok.

Dependencies

To run SINBAD, you need to have the underlying software tools installed.

Note that you only need to install the tools you will actually use; for example, you don't need BSMAP or BS3 if you will only use Bismark as the aligner.

You can install these tools yourself. For convenience, we provide the binaries here. Please cite the specific tool when you use it, in addition to MethylPipe.

You can download the demultiplex_fastq.pl script from here.

You also need the genomic sequence and annotated genomic regions for quantification of methylation calls. We provide the sequence data for the hg38 assembly here.

Graphical User Interface

SINBAD has an easy-to-use Graphical User Interface (GUI). Detailed instructions for the GUI are available in the SINBAD User Manual.

Configuration

To run SINBAD, you need to modify three configuration files.

You can download the templates for the configuration files from here and edit them for your purposes.

Running

SINBAD is run in two steps:

  1. Read configuration files:
read_configs(config_dir)

config_dir should point to your configuration file directory (mentioned above).

  2. Process data:
process_sample_wrapper(raw_fastq_dir, demux_index_file, working_dir, sample_name)

This function reads FASTQ files, demultiplexes them into single cells, performs filtering, mapping (alignment), DNA methylation calling and quantification, dimensionality reduction, clustering and differential methylation analysis for the given input. All the outputs are placed into related directories in working_dir.
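
The two steps can be put together in a short script. The sketch below is illustrative only: `check_sinbad_inputs` is a hypothetical helper that is not part of SINBAD, all paths and the sample name are placeholders, and creating `working_dir` in advance is an assumption (SINBAD may well create it on its own). The calls to `read_configs` and `process_sample_wrapper` use the argument order shown above and run only if SINBAD is installed and the placeholder configuration directory exists.

```r
# Hypothetical helper (not part of SINBAD): validate the inputs before a run.
check_sinbad_inputs <- function(config_dir, raw_fastq_dir,
                                demux_index_file, working_dir) {
  stopifnot(
    dir.exists(config_dir),
    dir.exists(raw_fastq_dir),
    file.exists(demux_index_file)
  )
  # Assumption: make sure the output directory exists before processing.
  if (!dir.exists(working_dir)) {
    dir.create(working_dir, recursive = TRUE)
  }
  invisible(TRUE)
}

# Placeholder values; replace them with your own locations.
config_dir       <- "path/to/config"
raw_fastq_dir    <- "path/to/raw_fastq"
demux_index_file <- "path/to/demux_index.txt"
working_dir      <- "path/to/working_dir"
sample_name      <- "my_sample"

# Run only when SINBAD is available and the placeholders have been replaced.
if (requireNamespace("SINBAD", quietly = TRUE) && dir.exists(config_dir)) {
  library(SINBAD)
  check_sinbad_inputs(config_dir, raw_fastq_dir, demux_index_file, working_dir)
  read_configs(config_dir)                       # step 1: read configuration
  process_sample_wrapper(raw_fastq_dir, demux_index_file,
                         working_dir, sample_name)  # step 2: process data
}
```

Checking the paths up front means a mistyped directory is reported immediately rather than partway through a long run.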

Example Data

For testing SINBAD, we provide sample raw read (FASTQ) data.

Citation

If you use SINBAD in your study, please cite it as follows:

SINBAD: A pipeline for processing SINgle cell Bisulfite sequencing samples and Analysis of Data, GitHub, 2021.

Contact

For any questions or comments, please contact Yasin Uzun (uzuny at email chop edu).



yasin-uzun/SINBAD documentation built on March 20, 2022, 11:48 p.m.