getdeg: Derive growth-modifying effect of gene knockout in pooled...

Description Usage Arguments Details Value Note Author(s) See Also Examples

View source: R/CSSA.R


getdeg was specifically designed to derive the effect of gene knockout on cell growth based on results from pooled CRISPR-Cas9 experiments. Using a combination of both rate ratios and (assumed or estimated) maximum population doublings, the straight lethality and optionally sensitization / synthetic lethality are calculated based on the "most efficacious guide targeting the gene", i.e. the feature that shows the most extreme rate ratio change within its group.


getdeg(guides, r0, r1, rt = FALSE, a, b, secondbest = TRUE,
  skipcutoff = FALSE, correctab = TRUE)



Character vector. Guides are assumed to start with the gene name, followed by an underscore, followed by a number or sequence unique within that gene.


Numeric vector. Rate ratios of features representing straight lethality.


Numeric vector. Rate ratios of features representing sensitization or synthetic lethality. Optional but required to calculate e.


Numeric vector. Rate ratios of features representing lethality in the test sample. Optional.


Numeric. Estimated potential population doublings between time points.


Numeric. Estimated potential population doublings between time points in test sample. Only applicable if r1 is given. If omitted, assumed equal to a.


Logical. If TRUE, calculate effect sizes based on the second best guides of each gene as well. Default = TRUE


Logical or numeric. If specified, do no calculate effect sizes of genes with maximum absolute rate ratios below this cut-off. Default = FALSE


Logical. When a != b, it is be possible (and necessary?) to mathematically correct for this difference. If you analyze an experiment with unequal a and b, try both with and without correction and read the notes below. Default = TRUE


getdeg derives gene knockout effect sizes based on rate ratios. These in turn are derived from sequencing coverage of the features (e.g. guides in a CRISPR-bases screen). The function expects log2-transformed rate ratios. It also requires an estimate of potential population doublings, basically meaning that cells without successful knockout (or knockout of a gene without any growth-modifying effect) would have divided this many times. An example: the log2 rate ratio r0 of a guide between t1 and t0 is -3, and the screen encompassed a = 6 doublings. If the guide was successful in all cells (g = 1), the effect of the corresponding gene knockout is d = r/a = -0.5. The function assumes the guide with the most extreme r with the same direction as the median r of all guides targeting that gene has this efficacy of 1, and then calculates g for the other guides. Things get more interesting when there is also a treatment effect. In this case it compares rate ratios of treated versus untreated and t1 versus t0. From these it will decide which is the best guide and calculate both straight lethal effect d and sensitizing effect e. Although this will generally improve robustness, it is always a good idea to compare the results with the analysis of a single rate ratio measure. Optionally, but by default, the effects based on the second-best guide are also calculated. This function does not do anything in terms of statistics. It expects precautions are taken in the calculation of rate ratios!


Returns a list with the following (depending on input arguments):


When comparing two experimental arms that have had different numbers of population doublings, things get quirky. I have put the mathematical correction in the function, which you can turn off with correctab = FALSE. I have noticed that correction gives straight lethal genes artificially high treatment resistance (positive e). But when I do not correct, I see a downward skew here. In case of a resistance screen, it may be more useful to look at the uncorrected variant. If you are interested in picking up sensitizers, I would recommend correcting.


Jos B. Poell

See Also

CRISPRsim, jar, radjust


ut <- CRISPRsim(5000, 4, a = c(3,3), allseed = 1)
tr <- CRISPRsim(5000, 4, a = c(3,3), e = TRUE, allseed = 1)
cgi <- tr$d > -0.05 & tr$d < 0.05 & tr$e > -0.05 & tr$e < 0.05
r0 <- jar(ut$t6, ut$t0, normsubset = cgi)
r1 <- jar(tr$t6, ut$t6, normsubset = cgi)
deg <- getdeg(ut$guides, r0, r1, a = 6, b = 6, secondbest = FALSE)
reald <- rle(tr$d)$values
reale <- rle(tr$e)$values
plot(reald, deg$d)
plot(reale, deg$e)

tgac-vumc/CSSA documentation built on Dec. 14, 2019, 9:40 p.m.