
pipelineNGS

pipelineNGS is a package for processing epigenomic high-throughput data, specifically histone mark ChIP-seq and ATAC-seq.

Getting started

Programs you need to have installed and included in your path:

Because this package wraps several command-line tools, these programs must be in your $PATH. If they are not, you can instead provide the path to each binary through the corresponding function arguments.
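
One quick way to confirm that a tool is visible from R is to query `Sys.which()`. The sketch below is a minimal check and is not part of pipelineNGS: the helper name `check_tools` is hypothetical, and the tool name in the usage note is only an illustration, since the list of required programs is not reproduced here.

```r
# Hypothetical helper (not part of pipelineNGS): report which command-line
# tools can be found in the current $PATH.
check_tools <- function(tools) {
  paths <- Sys.which(tools)  # returns "" for tools that are not found
  stats::setNames(unname(paths) != "", tools)
}
```

For example, `check_tools("fastqc")` returns a named logical vector with `TRUE` when the binary is found. Replace the name with the programs required by the steps you plan to run.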

Files you need to have on your local machine:

Additionally, you will need to download reference files to perform the different steps in the pre-processing pipeline:

Installing pipelineNGS on your local machine

Open an R session, install the devtools package if it is not already installed, and run the following:

# Install pipelineNGS package
devtools::install_github("mireia-bioinfo/pipelineNGS")

# Load pipelineNGS package
library(pipelineNGS)
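
The commands above assume devtools is already available. A minimal sketch of the "install it if missing" step could look like the following; `needs_install` is a hypothetical helper, not part of pipelineNGS or devtools.

```r
# Hypothetical helper: TRUE if the package `pkg` is not installed.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Install devtools only when it is missing. The interactive() guard means
# nothing is installed when this file is sourced from a script.
if (interactive() && needs_install("devtools")) {
  install.packages("devtools")
}
```

Checking with `requireNamespace()` avoids reinstalling devtools every time the setup code is run.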

Pipeline Overview

This package currently implements pipelines for analyzing the following experiments:

In the following figure you can see a description of the steps needed for the analysis of each type of experiment, with specific arguments (if any) used in the different steps.

Here is an example of how to run a ChIP-seq analysis with single-end data.

## General parameters
index <- "/vault/refs/indexes/hg38"
blacklist <- "/vault/refs/Blacklist/lists/hg38-blacklist.v2.bed"

## Example Single End ##
fastq_files <- c("fastq/sample1_L1.fastq.gz", "fastq/sample1_L2.fastq.gz",
                 "fastq/sample1_L3.fastq.gz", "fastq/sample2_L2.fastq.gz",
                 "fastq/sample3_L1.fastq.gz", "fastq/sample3_L3.fastq.gz")

## Convert to list to use as input for process_epigenome()
# Create one list element for each sample
sample_names <- sapply(strsplit(basename(fastq_files), "_"), function(x) x[1])
fastq_input <- split(fastq_files, sample_names)
fastq_input

## Using the files described in the previous chunk:
process_epigenome(fastq_files=fastq_input,
                  out_name=names(fastq_input),
                  run_fastqc=TRUE,
                  seq_type="CT",
                  type="SE",
                  index=index,
                  blacklist=blacklist,
                  cores=6)
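
Before starting a long run, it can help to confirm that the input list has the expected shape and that every FASTQ file exists. The following is a minimal sketch under the same naming assumption as the example above (the sample name precedes the first underscore). Both helpers are hypothetical and not part of pipelineNGS.

```r
# Hypothetical helpers (not part of pipelineNGS) for checking the input list.

# Group FASTQ paths by the sample name that precedes the first underscore.
group_by_sample <- function(files) {
  sample_names <- vapply(strsplit(basename(files), "_"),
                         function(x) x[1], character(1))
  split(files, sample_names)
}

# Return the paths that do not exist on disk.
missing_fastq <- function(fastq_input) {
  all_files <- unlist(fastq_input, use.names = FALSE)
  all_files[!file.exists(all_files)]
}

fastq_files <- c("fastq/sample1_L1.fastq.gz", "fastq/sample1_L2.fastq.gz",
                 "fastq/sample2_L2.fastq.gz")
fastq_input <- group_by_sample(fastq_files)

lengths(fastq_input)        # number of files per sample
missing_fastq(fastq_input)  # paths to fix before starting the pipeline
```

An empty result from `missing_fastq()` means every listed file was found. Any path it returns should be corrected before calling `process_epigenome()`.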


mireia-bioinfo/pipelineNGS documentation built on Jan. 2, 2023, 11:18 a.m.