knitr::opts_chunk$set( collapse = TRUE, eval = FALSE, comment = "#>" )
The Orthrus package contains all the computational tools you need to process, score and analyze combinatorial CRISPR screening data.
This document will guide you through the process of scoring a published combinatorial screening dataset. Key features of this dataset are summarized below, and in-depth descriptions of this dataset and how it was originally scored are available in Gonatopoulos-Pournatzis et al.
A more detailed walkthrough of how to apply Orthrus and analyze combinatorial screening data is forthcoming in a separate manuscript.
Orthrus offers three scoring interfaces: manual, batch and wrapper.
Please refer to the following publications for more information on the CHyMErA experimental platform, CRISPR screens and scoring them, or alternative approaches for scoring combinatorial CRISPR screening data.
To follow this vignette, familiarity with CRISPR screening technology is strongly recommended. Familiarity with combinatorial CRISPR screening platforms or other ways to score CRISPR screening data is recommended, but not required.
Install Orthrus and its dependencies if necessary.
# Installs CRAN packages
install.packages("ggplot2")
install.packages("ggthemes")
install.packages("pheatmap")
install.packages("PRROC")
install.packages("RColorBrewer")

# Installs the limma package from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("limma")

# Installs Orthrus from Github
library(devtools)
devtools::install_github("HenryWard/orthrus")
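Before moving on, it can help to confirm which of these packages are actually available in the current R library. The following is a minimal sketch in base R; the variable names `required`, `installed` and `missing_pkgs` are illustrative and not part of Orthrus.

```r
# Packages named in the installation step above
required <- c("ggplot2", "ggthemes", "pheatmap", "PRROC",
              "RColorBrewer", "limma", "orthrus")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need to be installed
missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

If `missing_pkgs` is empty, all dependencies were found and the remaining steps should load without installation errors.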
Load packages.
library(orthrus)
Create output folders.
# Renames dataset
df <- chymera_paralog

# Makes output folders
output_folder <- file.path("vignette_output")
plot_folder <- file.path(output_folder, "scored")
qc_folder <- file.path(output_folder, "qc")
lfc_folder <- file.path(qc_folder, "lfc")
if (!dir.exists(output_folder)) { dir.create(output_folder, recursive = TRUE) }
if (!dir.exists(plot_folder)) { dir.create(plot_folder) }
if (!dir.exists(qc_folder)) { dir.create(qc_folder) }
if (!dir.exists(lfc_folder)) { dir.create(lfc_folder) }
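The repeated existence checks can also be written as a loop. This is a minimal sketch in base R that uses a temporary directory as a hypothetical root, so it can be run without touching the vignette's own output folder; the names `root`, `folders` and `created` are illustrative.

```r
# Hypothetical root for illustration; the vignette itself uses "vignette_output"
root <- file.path(tempdir(), "vignette_output_demo")

folders <- c(
  scored = file.path(root, "scored"),
  qc     = file.path(root, "qc"),
  lfc    = file.path(root, "qc", "lfc")
)

# recursive = TRUE creates missing parent folders;
# showWarnings = FALSE keeps reruns quiet when a folder already exists
for (f in folders) {
  dir.create(f, recursive = TRUE, showWarnings = FALSE)
}

# TRUE for every folder that now exists
created <- dir.exists(folders)
```

Because `dir.create` only warns (rather than erroring) on an existing folder, this version is safe to rerun at the top of a script.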
Call the add_screens_from_table function to build up a list of screens with names and corresponding technical replicates, starting with T0 replicates.
screens <- add_screens_from_table(chymera_sample_table)
The first thing we want to do is make quality-control plots for raw read counts with the function plot_reads_qc. Output these to the previously-created QC folder.
plot_reads_qc(df, screens, qc_folder)
Now we need to normalize each screen in three different ways:
The function normalize_screens automatically performs all of these normalization steps. The function infers which columns of df need to be normalized to which T0 screens based on the normalize_name parameter of each screen in screens (screens without this optional parameter will not be normalized to other screens). Log-scaling and depth-normalization are performed on each screen regardless of the normalize_name parameter. For example, after normalization, T0 columns in df will contain log-scaled, depth-normalized read counts, whereas columns from later timepoints will contain depth-normalized LFCs compared to their respective T0s.
df <- normalize_screens(df, screens, filter_names = c("HAP1_T0", "RPE1_T0"), min_reads = 30)
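To make the arithmetic concrete, the sketch below applies log-scaling, depth-normalization and LFC computation to a toy table in base R. It is not the Orthrus implementation: the column names, the reads-per-million scaling factor, the pseudocount of 1 and the helper `log_norm` are all assumptions made for illustration.

```r
# Toy read counts for three guides; column names are hypothetical
counts <- data.frame(
  T0  = c(100, 400, 500),
  T18 = c(50, 800, 1150)
)

# Depth-normalize to reads per million, add a pseudocount, then log2-scale.
# The scaling factor and pseudocount are assumptions for this sketch.
log_norm <- function(x, scale = 1e6, pseudocount = 1) {
  log2((x / sum(x)) * scale + pseudocount)
}

normed <- as.data.frame(lapply(counts, log_norm))

# LFC of the later timepoint relative to its T0
lfc <- normed$T18 - normed$T0

# Flag guides with enough T0 reads, analogous in spirit to min_reads = 30
keep <- counts$T0 >= 30
```

The second guide doubles its raw count between timepoints, but so does the total sequencing depth, so its LFC is zero. This is the effect depth-normalization is meant to capture.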
Make detailed QC plots for LFC data by calling the plot_lfc_qc function, specifying the gene names of any negative control guides with the negative_controls parameter.
plot_lfc_qc(df, screens, qc_folder, display_numbers = FALSE, plot_type = "pdf", negative_controls = c("NT"))
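One common quality check on LFC data is the correlation between technical replicates. The sketch below computes a Pearson correlation matrix on simulated LFCs in base R. It illustrates the general idea only: the replicate names and simulated values are hypothetical, and whether plot_lfc_qc computes correlations in exactly this way is an assumption.

```r
set.seed(1)

# Simulated LFCs: two replicates share a true signal plus independent noise
true_lfc <- rnorm(200)
lfc <- data.frame(
  rep_A = true_lfc + rnorm(200, sd = 0.3),
  rep_B = true_lfc + rnorm(200, sd = 0.3)
)

# Pairwise Pearson correlation between replicate columns
pcc <- cor(lfc, method = "pearson")
```

Replicates of the same screen that correlate poorly with each other are worth inspecting before scoring.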
The last thing we need to do before scoring data is parse it into a different structure and split guides by their type, since we score dual-targeting guides separately from combinatorial-targeting guides.
guides <- split_guides(df, screens, "Cas9.Guide", "Cpf1.Guide")
dual <- guides[["dual"]]
single <- guides[["single"]]
paralogs <- guides[["combn"]]
To score data with the batch scoring interface, we call the score_conditions_batch and score_combn_batch functions separately.
batch_table <- chymera_batch_table
score_conditions_batch(dual, screens, batch_table, output_folder,
                       test = "moderated-t",
                       loess = TRUE,
                       filter_genes = c("NT"),
                       neg_type = "Sensitizer",
                       pos_type = "Suppressor",
                       fdr_threshold = 0.1,
                       differential_threshold = 0.5,
                       plot_type = "pdf")
score_combn_batch(paralogs, single, screens, batch_table, output_folder,
                  test = "moderated-t",
                  loess = TRUE,
                  filter_genes = c("NT"),
                  neg_type = "Sensitizer",
                  pos_type = "Suppressor",
                  fdr_threshold = 0.2,
                  differential_threshold = 0.5,
                  plot_type = "pdf")
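To illustrate how an FDR threshold and a differential effect threshold can jointly define hits, the sketch below labels a toy table in base R. It is not the Orthrus scoring code. The column names, the Benjamini-Hochberg adjustment, the strict inequalities, the "None" label and the mapping of negative effects to "Sensitizer" are all assumptions made for illustration.

```r
# Toy scored table; column names and values are hypothetical
scored <- data.frame(
  gene         = c("A", "B", "C", "D"),
  differential = c(-1.2, 0.8, -0.2, 0.6),
  pval         = c(0.001, 0.004, 0.0005, 0.6)
)

fdr_threshold <- 0.1
differential_threshold <- 0.5

# Benjamini-Hochberg adjustment (an assumption about the correction method)
scored$fdr <- p.adjust(scored$pval, method = "BH")

# A hit must pass both the FDR and the effect-size threshold
significant <- scored$fdr < fdr_threshold &
  abs(scored$differential) > differential_threshold

# Label hits by the sign of their differential effect
scored$effect_type <- "None"
scored$effect_type[significant & scored$differential < 0] <- "Sensitizer"
scored$effect_type[significant & scored$differential > 0] <- "Suppressor"
```

In this toy example, gene C has the smallest p-value but is not called because its effect size is below the differential threshold, and gene D has a large enough effect but fails the FDR threshold.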
This concludes a brief walkthrough of how to use Orthrus to score combinatorial CRISPR screening data. However, we left out arguably the most important step: manual sanity-checking and analysis of individual output plots, metrics and scored data. We strongly advise that users check all output files to investigate their data's quality, and revise data processing or analysis steps accordingly (e.g. filtering out T0 reads more strictly, tightening differential effect and FDR thresholds, manually removing problematic guides, ensuring that positive controls are called as significant hits). The accompanying protocol contains guidance for interpreting Orthrus' various output files and suggestions for how to change certain parameters.