ALTRE: Workflow: Post-alignment to altered TSS-proximal and...

Description

The ALTRE workflow takes aligned reads and peak/hotspot calls from assays of open chromatin (e.g. ATAC-seq, DNase-seq) and identifies regions (e.g. TSS-proximal and TSS-distal) that differ between cell and tissue types.

Details

ALTRE requires a sample information file (CSV format), peak files (BED format), and alignment files (BAM format) as input. All input files must be in the same folder.
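Because all inputs must sit in one folder, a quick check before starting can catch a missing file type early. The following is a minimal sketch in base R, not part of ALTRE: `checkInputFolder` is a hypothetical helper, and it assumes the files use the extensions `.csv`, `.bed`, and `.bam`.

```r
# Hypothetical helper (not an ALTRE function): count the input files
# of each expected type in a single folder.
# Assumption: inputs are named with .csv, .bed and .bam extensions.
checkInputFolder <- function(folder) {
  files <- list.files(folder)
  ext <- tolower(tools::file_ext(files))
  counts <- c(
    csv = sum(ext == "csv"),
    bed = sum(ext == "bed"),
    bam = sum(ext == "bam")
  )
  missing <- names(counts)[counts == 0]
  if (length(missing) > 0) {
    warning("No files found with extension(s): ",
            paste(missing, collapse = ", "))
  }
  counts
}

# Demonstration on a temporary folder holding empty placeholder files.
demoFolder <- file.path(tempdir(), "altre_inputs")
dir.create(demoFolder, showWarnings = FALSE)
invisible(file.create(
  file.path(demoFolder, c("samples.csv", "A_rep1.bed", "A_rep1.bam"))
))
inputCounts <- checkInputFolder(demoFolder)
```

In practice, `folder` would be the directory that holds the sample information file together with the peak and alignment files it refers to.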

Workflow Steps: The order in which the functions should be used is given below (click on a function for more detailed information). For a detailed vignette, go to https://mathelab.github.io/ALTRE/vignette.html.

  1. loadCSVFile

    Reads in a sample information file in CSV format and outputs a data frame.

  2. loadBedFiles

    Takes in a sample information data frame (output of loadCSVFile), loads the peak files, and outputs a GRangesList object that holds all peaks for each sample type.

  3. getConsensusPeaks

    Takes in a GRangesList object with peaks (output of loadBedFiles in step 2) and outputs consensus peaks. Consensus peaks are those present in at least N replicates. The function outputs a list containing a GRanges object with the consensus peaks and a data frame with summary statistics. A barplot summarizing the number of consensus peaks and the number of peaks in each replicate can be generated with plotConsensusPeaks().

  4. combineAnnotatePeaks

    Peaks for each sample type (from the GRanges object output by the previous function) are combined and annotated with type specificity (which cell types the hotspot is present in) and with whether each region in the GRanges is TSS-proximal (default: <1500bp from a transcription start site) or TSS-distal (>1500bp from a transcription start site). The function can also merge regulatory elements that are within a specified distance of each other. This function requires an annotation of transcription start sites (e.g. to retrieve it from Ensembl, run TSS <- getTSS()). To view the number and length of peaks before/after merging, use the function plotCombineAnnotatePeaks().

  5. getCounts

    The number of reads overlapping each annotated regulatory element from the previous step is counted. To view a density plot of the sizes of the regions, use the function plotGetCounts().

  6. countanalysis

    This function runs differential analysis, using DESeq2, to identify significantly altered regulatory elements (TSS-proximal or TSS-distal).

  7. categAltrePeaks

    This function allows the user to modify cutoffs for p-values and log fold changes to define altered or shared regulatory elements. A volcano plot (log fold changes vs -log p-values), highlighting significantly altered regulatory elements, can be viewed by running the function plotCountAnalysis(). To view the distribution of counts (as FPKM) in each type of regulatory element (e.g. TSS-proximal shared/experiment-specific/reference-specific), use the function plotDistCountAnalysis().

  8. comparePeaksAltre

    This function compares the number of regulatory elements identified as altered or shared between two sample types. The two methods compared are: 1) using peak presence and associated intensity (e.g. amount of chromatin accessibility); 2) using peak presence only, as determined by the peak/hotspot caller. To visualize differences in the number of regulatory elements called as specific or shared, use the function plotCompareMethods().

  9. runGREAT and processPathways

    Determines which pathways are overrepresented in altered TSS-proximal or TSS-distal regions using GREAT. By default, Gene Ontology pathway annotations are used. To visualize the pathways that are enriched, use the function plotGREATenrich().
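The consensus rule in step 3 ("present in at least N replicates") can be illustrated with a small base R sketch. It is not ALTRE's implementation: the coordinates are made up, all peaks are assumed to lie on a single chromosome, and the approach (merge all overlapping peaks into candidate regions, then count the replicates that overlap each region) is one plausible reading of the rule. `mergeIntervals` and `countSupport` are hypothetical helper names.

```r
# Toy peak calls for three replicates of one sample type
# (made-up coordinates, single chromosome assumed).
replicates <- list(
  rep1 = data.frame(start = c(100, 900),  end = c(200, 1000)),
  rep2 = data.frame(start = c(150, 2000), end = c(260, 2100)),
  rep3 = data.frame(start = c(180, 950),  end = c(240, 1050))
)

# Hypothetical helper: merge overlapping intervals into candidate regions.
mergeIntervals <- function(df) {
  df <- df[order(df$start, df$end), ]
  # A new region starts when an interval begins after the furthest end so far.
  furthestEnd <- cummax(df$end)
  newGroup <- c(TRUE, df$start[-1] > furthestEnd[-nrow(df)])
  group <- cumsum(newGroup)
  data.frame(
    start = as.numeric(tapply(df$start, group, min)),
    end   = as.numeric(tapply(df$end, group, max))
  )
}

# Hypothetical helper: number of replicates with a peak overlapping a region.
countSupport <- function(regionStart, regionEnd, replicates) {
  sum(vapply(replicates, function(r) {
    any(r$start <= regionEnd & r$end >= regionStart)
  }, logical(1)))
}

candidates <- mergeIntervals(do.call(rbind, replicates))
candidates$nReplicates <- vapply(seq_len(nrow(candidates)), function(i) {
  countSupport(candidates$start[i], candidates$end[i], replicates)
}, integer(1))

# Keep regions supported by at least N replicates (N = 2 here, an assumption).
minReps <- 2
consensus <- candidates[candidates$nReplicates >= minReps, ]
```

With real data, the same operations would normally be done on GRanges objects rather than plain data frames.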

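The TSS-proximal versus TSS-distal annotation in step 4 comes down to comparing each region's distance to the nearest transcription start site against a cutoff. The base R sketch below uses the 1500bp default stated above. Everything else is an assumption made for illustration: the coordinates are made up, a single chromosome is used, distance is measured from the edge of the region, and a distance of exactly 1500bp is treated as distal. `distToNearestTSS` is a hypothetical helper, not an ALTRE function.

```r
# Toy regions and TSS positions on one chromosome (made-up coordinates).
peaks <- data.frame(
  start = c(1000, 5200, 20000),
  end   = c(1400, 5600, 20500)
)
tss <- c(1200, 7500)

# Hypothetical helper: distance from each interval to its nearest TSS.
# The distance is 0 when a TSS falls inside the interval.
distToNearestTSS <- function(start, end, tss) {
  vapply(seq_along(start), function(i) {
    min(pmax(start[i] - tss, tss - end[i], 0))
  }, numeric(1))
}

cutoff <- 1500  # default stated in step 4
peaks$distance <- distToNearestTSS(peaks$start, peaks$end, tss)
peaks$region <- ifelse(peaks$distance < cutoff, "TSS-proximal", "TSS-distal")
```

Here the first region contains a TSS and is labeled TSS-proximal, while the other two lie 1900bp and 12500bp from the nearest TSS and are labeled TSS-distal.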

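Step 7 assigns each regulatory element to a category from its p-value and log fold change. The sketch below shows the general idea in base R. It is not ALTRE's rule: the column names, the cutoff values, the sign convention (positive fold change meaning higher in the experiment sample), and the "ambiguous" label for everything in between are assumptions chosen for illustration.

```r
# Toy differential-analysis results (made-up values, assumed column names).
results <- data.frame(
  log2FoldChange = c(2.3, -1.8, 0.1, 0.9),
  padj           = c(0.001, 0.004, 0.80, 0.03)
)

# Assumed cutoffs; in practice these are the values the user tunes.
lfcSpecific  <- 1.5
pvalSpecific <- 0.01
lfcShared    <- 1.2
pvalShared   <- 0.05

# Start from "ambiguous" and overwrite rows that meet a rule.
category <- rep("ambiguous", nrow(results))
significant <- results$padj <= pvalSpecific
category[significant & results$log2FoldChange >=  lfcSpecific] <- "experiment-specific"
category[significant & results$log2FoldChange <= -lfcSpecific] <- "reference-specific"
category[results$padj > pvalShared &
           abs(results$log2FoldChange) < lfcShared] <- "shared"
results$category <- category
```

Tightening or loosening the four cutoffs moves elements between the specific, shared, and ambiguous groups, which is the effect a volcano plot makes visible.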
Mathelab/ALTRE documentation built on May 7, 2019, 3:41 p.m.