getdeg | R Documentation |
getdeg was specifically designed to derive the effect of gene knockout on cell growth based on results from pooled CRISPR-Cas9 experiments. Using a combination of both rate ratios and (assumed or estimated) maximum population doublings, the straight lethality and optionally sensitization / synthetic lethality are calculated based on the "most efficacious guide targeting the gene", i.e. the feature that shows the most extreme rate ratio change within its group.
getdeg( guides, r0, r1, rt = FALSE, a, b, secondbest = TRUE, skipcutoff = FALSE, correctab = TRUE )
guides |
Character vector. Guides are assumed to start with the gene name, followed by an underscore, followed by a number or sequence unique within that gene. |
r0 |
Numeric vector. Log2-transformed rate ratios of features representing straight lethality. |
r1 |
Numeric vector. Log2-transformed rate ratios of features representing sensitization or synthetic lethality. Optional but required to calculate e. |
rt |
Numeric vector. Log2-transformed rate ratios of features representing lethality in the test sample. Optional. |
a |
Numeric. Estimated potential population doublings between time points. |
b |
Numeric. Estimated potential population doublings between time points in test sample. Only applicable if r1 is given. If omitted, assumed equal to a. |
secondbest |
Logical. If TRUE, calculate effect sizes based on the second best guides of each gene as well. Default = TRUE |
skipcutoff |
Logical or numeric. If specified, do not calculate effect sizes of genes with maximum absolute rate ratios below this cut-off. Default = FALSE |
correctab |
Logical. When |
getdeg derives gene knockout effect sizes based on rate ratios. These in turn are derived from sequencing coverage of the features (e.g. guides in a CRISPR-bases screen). The function expects log2-transformed rate ratios. It also requires an estimate of potential population doublings, basically meaning that cells without successful knockout (or knockout of a gene without any growth-modifying effect) would have divided this many times. An example: the log2 rate ratio r0 of a guide between t1 and t0 is -3, and the screen encompassed a = 6 doublings. If the guide was successful in all cells (g = 1), the effect of the corresponding gene knockout is d = r/a = -0.5. The function assumes the guide with the most extreme r with the same direction as the median r of all guides targeting that gene has this efficacy of 1, and then calculates g for the other guides. Things get more interesting when there is also a treatment effect. In this case it compares rate ratios of treated versus untreated and t1 versus t0. From these it will decide which is the best guide and calculate both straight lethal effect d and sensitizing effect e. Although this will generally improve robustness, it is always a good idea to compare the results with the analysis of a single rate ratio measure. Optionally, but by default, the effects based on the second-best guide are also calculated. This function does not do anything in terms of statistics. It expects precautions are taken in the calculation of rate ratios!
Returns a list with the following (depending on input arguments):
genes - list of all gene symbols
n - number of guides representing the gene
d - gene knockout effects on straight lethality
d2 - gene knockout effects on straight lethality based on the second-best guide
e - gene knockout effects on sensitization
e2 - gene knockout effects on sensitization based on the second-best guide
de - gene knockout effects on straight lethality in the test arm
de2 - gene knockout effects on straight lethality in the test arm based on the second-best guide
g - estimated guide efficacy
i - within-gene index of the best guide
j - within-gene index of the second-best guide
When comparing two experimental arms that have had different numbers of
population doublings, things get quirky. I have put the mathematical
correction in the function, which you can turn off with correctab =
FALSE
. I have noticed that correction gives straight lethal genes
artificially high treatment resistance (positive e). But when I do not
correct, I see a downward skew here. In case of a resistance screen, it may
be more useful to look at the uncorrected variant. If you are interested in
picking up sensitizers, I would recommend correcting.
Jos B. Poell
CRISPRsim
, jar
, radjust
ut <- CRISPRsim(5000, 4, a = c(3,3), allseed = 1, perfectseq = TRUE) tr <- CRISPRsim(5000, 4, a = c(3,3), e = TRUE, allseed = 1, perfectseq = TRUE) cgi <- tr$d > -0.05 & tr$d < 0.05 & tr$e > -0.05 & tr$e < 0.05 r0 <- jar(ut$t6, ut$t0, normsubset = cgi) r1 <- jar(tr$t6, ut$t6, normsubset = cgi) deg <- getdeg(ut$guides, r0, r1, a = 6, b = 6, secondbest = FALSE) reald <- rle(tr$d)$values reale <- rle(tr$e)$values plot(reald, deg$d) plot(reale, deg$e)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.