In erikschild/TomoQC: Simple QC for tomo-seq

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  message = FALSE)

TomoQC

A quick quality check for tomosequencing data

Overview

The goal of TomoQC is to provide a quick, simple quality control to assess the quality of a tomosequencing sample. Read, UMI, and Transcript count tables are required for the function to run. Note: if using CelSeq2 primers, first select the 96 columns used out of the 384 CelSeq2 primers. The function can only run on data with 96 columns.

Installation

TomoQC is available to install from github:

# install.packages("devtools")
devtools::install_github("erikschild/TomoQC")

Example

The package includes a 1000 gene dummy dataset which gives an idea of how output may look. Note that in a real experiment, the data input would be based on many more genes.

Oversequencing

Plots a histogram of UMIs/reads per gene. Any occurrence of a value >1 indicates multiple reads originating from the same RNA molecule. The further the peak of the histogram shifts to the right, the more saturated sequencing depth is.

Spike-ins

Plots the percentage of all reads in a column mapping to ERCC spike-ins. A high percentage likely means no sample was present, and vice versa (lower = better).

Unique genes

The amount of unique mapped genes per column. More unique genes are expected to map where sample was present (higher = better).

library(TomoQC)
example <- tomo_quality(transcripts = example_data$transcripts,
                        reads = example_data$reads,
                        umis = example_data$UMIs,
                        cutoff_spike = 20,
                        cutoff_genes = 90,
                        plot_title = "Example output")
example