atacr is a package for creating statistics and diagnostic plots for short-read sequence data from capture-enriched RNAseq and ATACseq experiments.

This vignette provides a brief overview of the capabilities of atacr.

Sample data

The function simulate_counts() returns a small simulated data set of three replicates each from a control and a treatment. Each of the six sets of counts follows a mixed distribution: 10 counts drawn from a log-normal distribution with log-mean 4 and SD 1, and 40 counts drawn from a log-normal distribution with log-mean 10 and SD 1. This mimics the enrichment pattern seen with capture-enriched data. 10 of the counts are multiplied by a value drawn from a normal distribution with mean 2 and SD 1, so they can appear differentially expressed. These counts represent bait windows, the regions of the genome for which baits were designed. The bait-window counts are mixed with 50 non-bait windows, which have 0 counts.

library(atacr)
counts <- simulate_counts()
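To make the description of the simulated data concrete, the design can be imitated in base R. This is only a sketch and not the code behind simulate_counts(): it reads "logmean" and "SD" as the meanlog and sdlog arguments of rlnorm(), and it assumes that the multiplier is applied to the same 10 bait windows in the treatment replicates only. The names simulate_sample and sketch are hypothetical.

```r
# Base-R imitation of the simulated design (not atacr's own code).
set.seed(42)

n_low <- 10       # bait windows with log-mean 4
n_high <- 40      # bait windows with log-mean 10
n_non_bait <- 50  # non-bait windows, all zero

# One sample: 50 log-normal bait-window counts followed by 50 zero counts.
simulate_sample <- function() {
  bait <- c(rlnorm(n_low, meanlog = 4, sdlog = 1),
            rlnorm(n_high, meanlog = 10, sdlog = 1))
  c(round(bait), rep(0, n_non_bait))
}

sketch <- sapply(1:6, function(i) simulate_sample())
colnames(sketch) <- c(paste0("control_", 1:3), paste0("treatment_", 1:3))

# Assumption: 10 bait windows are scaled in the treatment replicates only.
de_rows <- sample(n_low + n_high, 10)
# abs() is a guard added here so that no count becomes negative.
multiplier <- abs(rnorm(10, mean = 2, sd = 1))
treatment_cols <- 4:6
sketch[de_rows, treatment_cols] <- round(sketch[de_rows, treatment_cols] * multiplier)

dim(sketch)
```

The result is a plain matrix with one row per window and one column per sample, which is convenient for trying out ideas before applying them to a real atacr object.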

Experiment Summary Information

Coverage information for bait and non-bait windows is available on a per-sample basis:

summary(counts)
plot(counts)

These plots can also be generated individually with the following functions:

coverage_summary(counts)
chromosome_coverage(counts)
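As a rough illustration of what a per-sample coverage summary for bait and non-bait windows involves, the same kind of table can be computed by hand on a plain count matrix. This sketch uses base R only and a made-up matrix (toy) with a made-up window_type vector; it does not reproduce the output format of summary() or coverage_summary().

```r
# Mean coverage per window type and sample on a toy matrix (base R only).
set.seed(1)

window_type <- rep(c("bait", "non_bait"), each = 4)
toy <- rbind(
  matrix(rpois(4 * 3, lambda = 200), nrow = 4),  # bait windows: high counts
  matrix(rpois(4 * 3, lambda = 2), nrow = 4)     # non-bait windows: low counts
)
colnames(toy) <- c("s1", "s2", "s3")

# For each sample (column), average the counts within each window type.
mean_coverage <- apply(toy, 2, function(sample_counts) {
  tapply(sample_counts, window_type, mean)
})

mean_coverage
```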

QC Plots

Plot for coverage by sequence and sample

plot_count_by_chromosome(counts)

Correlations between sample counts

sample_correlation_plot(counts)
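A sample correlation plot summarises how similar the count profiles of the samples are. The numbers behind such a plot can be sketched with base R's cor() on a window-by-sample matrix. The toy matrix, the log2 transform with a pseudocount of 1 and the choice of Pearson correlation are all assumptions made for this example; the vignette does not state which settings sample_correlation_plot() uses.

```r
# Pairwise sample correlations on a toy count matrix (base R only).
set.seed(1)

# Four samples that share one underlying profile plus multiplicative noise.
profile <- rlnorm(50, meanlog = 6, sdlog = 1)
toy <- sapply(1:4, function(i) round(profile * rlnorm(50, meanlog = 0, sdlog = 0.2)))
colnames(toy) <- c("control_1", "control_2", "treatment_1", "treatment_2")

# Log-transform with a pseudocount so that zero counts stay finite.
cor_mat <- cor(log2(toy + 1), method = "pearson")

round(cor_mat, 2)
```

Replicates of the same condition are expected to correlate strongly; a sample that correlates poorly with its own replicates is a candidate for closer inspection.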

Count windows below a threshold.

windows_below_coverage_threshold_plot(counts, which = "bait_windows", from = 0, to = 1000)
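The quantity shown in a windows-below-threshold plot can be sketched as a simple tally: for each threshold in a range, count the windows in each sample whose count falls below it. The example uses a made-up matrix and a strict "less than" comparison; whether atacr compares with "less than" or "less than or equal to" is not stated here, so treat that as an assumption.

```r
# Number of windows below each coverage threshold, per sample (base R only).
set.seed(1)

toy <- matrix(rpois(20 * 3, lambda = 5), nrow = 20,
              dimnames = list(NULL, c("s1", "s2", "s3")))

thresholds <- 0:10

# One column per threshold, one row per sample.
below <- sapply(thresholds, function(threshold) colSums(toy < threshold))
colnames(below) <- thresholds

below
```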

MA plots

ma_plot(counts)
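For readers unfamiliar with MA plots: for two sets of counts, M is the log2 ratio and A is the mean of the log2 values. A minimal base-R sketch follows, using two made-up count vectors and a pseudocount of 1 to avoid log2(0). The reference that ma_plot() compares each sample against is not described in this vignette, so the two-sample comparison here is an assumption.

```r
# M and A values for two toy samples (base R only).
set.seed(1)

control <- rlnorm(100, meanlog = 6, sdlog = 1)
treatment <- control * rlnorm(100, meanlog = 0, sdlog = 0.3)

pseudocount <- 1
log_control <- log2(control + pseudocount)
log_treatment <- log2(treatment + pseudocount)

M <- log_treatment - log_control          # log2 ratio
A <- 0.5 * (log_treatment + log_control)  # mean log2 count

plot(A, M, pch = 20, xlab = "A (mean log2 count)", ylab = "M (log2 ratio)")
abline(h = 0, lty = 2)  # no-change line
```

A cloud centred on M = 0 across the range of A suggests the two samples are comparable; a systematic shift or trend suggests normalisation is needed.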

Normalisation

Normalisation strategies are easy to implement with atacr, and helper functions are included for common cases.

counts$library_size_normalised <- library_size_normalisation(counts)
ma_plot(counts, which = "library_size_normalised")
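The idea behind library size normalisation can be sketched on a plain matrix: divide each sample by a factor proportional to its total count so that all samples end up with the same total. The scaling convention used here (total relative to the mean total) is one common choice and an assumption; library_size_normalisation() may scale differently.

```r
# Library size scaling on a toy count matrix (base R only).
set.seed(1)

toy <- matrix(rpois(10 * 3, lambda = 100), nrow = 10,
              dimnames = list(NULL, c("s1", "s2", "s3")))
toy[, "s3"] <- toy[, "s3"] * 2L  # pretend s3 was sequenced twice as deeply

library_sizes <- colSums(toy)
scale_factors <- library_sizes / mean(library_sizes)

# Divide every column by its own scale factor.
normalised <- sweep(toy, 2, scale_factors, "/")

colSums(normalised)
```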

Normalisation by control windows requires a text file with the control window positions.

window_file <- "control_windows.txt"
counts$control_window_normalisation <- control_window_normalise(counts, window_file)
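The reasoning behind control window normalisation is that windows expected to be unchanged between samples can be used to estimate per-sample scaling factors. The following base-R sketch shows one way this could work on a plain matrix. The row indices in control_rows are hypothetical, the use of mean counts is an assumption, and nothing here reflects the file format or the internals of control_window_normalise().

```r
# Scaling by control windows on a toy count matrix (base R only).
set.seed(1)

toy <- matrix(rpois(10 * 3, lambda = 100), nrow = 10,
              dimnames = list(NULL, c("s1", "s2", "s3")))
toy[, "s2"] <- toy[, "s2"] * 3L  # pretend s2 has a global three-fold excess

control_rows <- c(2, 5, 9)  # hypothetical control windows

# Per-sample factor: mean control-window count relative to the overall mean.
control_means <- colMeans(toy[control_rows, , drop = FALSE])
scale_factors <- control_means / mean(control_means)

normalised <- sweep(toy, 2, scale_factors, "/")

# After scaling, the control windows have the same mean in every sample.
colMeans(normalised[control_rows, , drop = FALSE])
```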

Detect differentially expressed/accessible windows

A simple bootstrap t-test method is provided for two-way comparisons.

result <- estimate_fdr(counts, "treatment", "control", which = "bait_windows")

pander::pandoc.table(head(result))
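To give a sense of what a bootstrap t-test followed by false discovery rate estimation involves, here is a generic base-R sketch. It is not the implementation behind estimate_fdr(), whose internals are not described in this vignette. The helper names welch_t and bootstrap_t_pvalue are hypothetical, the null distribution is built by shifting both groups to a common mean before resampling, and the Benjamini-Hochberg correction from p.adjust() is used as a stand-in FDR estimate.

```r
# Generic bootstrap t-test per window, then BH adjustment (base R only).
set.seed(1)

# Welch-style t statistic with a guard for zero variance in tiny resamples.
welch_t <- function(x, y) {
  d <- mean(x) - mean(y)
  se <- sqrt(var(x) / length(x) + var(y) / length(y))
  if (se == 0) return(if (d == 0) 0 else sign(d) * Inf)
  d / se
}

bootstrap_t_pvalue <- function(x, y, n_boot = 999) {
  t_obs <- welch_t(x, y)
  # Shift both groups to the pooled mean so the null hypothesis holds.
  pooled_mean <- mean(c(x, y))
  x0 <- x - mean(x) + pooled_mean
  y0 <- y - mean(y) + pooled_mean
  t_boot <- replicate(n_boot, welch_t(sample(x0, replace = TRUE),
                                      sample(y0, replace = TRUE)))
  # Two-sided p-value with the usual +1 correction.
  (1 + sum(abs(t_boot) >= abs(t_obs))) / (n_boot + 1)
}

# Toy data: 5 windows, 3 replicates per condition, window 1 shifted upwards.
control <- matrix(rnorm(5 * 3, mean = 100, sd = 10), nrow = 5)
treatment <- matrix(rnorm(5 * 3, mean = 100, sd = 10), nrow = 5)
treatment[1, ] <- treatment[1, ] + 80

p_values <- sapply(1:5, function(i) bootstrap_t_pvalue(treatment[i, ], control[i, ]))
fdr <- p.adjust(p_values, method = "BH")

data.frame(window = 1:5, p_value = p_values, fdr = fdr)
```

With only three replicates per group the number of distinct resamples is small, so the achievable p-values are coarse; this is a limitation of the sketch rather than a statement about atacr.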

This can also be done for multiclass designs with multiple samples against a common reference.

multi_result <- estimate_fdr_multiclass(counts, "control", which = "bait_windows")
pander::pandoc.table(head(multi_result))


TeamMacLean/atacr documentation built on May 9, 2019, 4:24 p.m.