This function identifies the core fitness genes from a given quantitative knockout screen dependency matrix, where each row is a gene and each column is a cell line. The function uses all the cell lines and identifies the genes that are essential in the majority of them.
ADAM2.SlopeCF(depMat,
              display=TRUE,
              prefix='SlopeMethod')
depMat |
Quantitative knockout screen dependency matrix where rows are genes and columns are samples. A real number in position [i,j] represents the strength of dependency, that is, the loss of fitness in the j-th sample when the i-th gene is inactivated. A higher strength of dependency indicates a higher probability of being a core fitness gene. These values are used to rank the genes with respect to their gene effect scores. |
display |
Boolean, default is TRUE. Whether bar plots of the dependency profiles should be plotted. |
prefix |
Character string, default is 'SlopeMethod'. If display is FALSE, the plots are generated in the working directory using this prefix. |
This function implements the idea that an essential gene should rank well in all cell lines, including the least dependent ones. Instead of calculating a rank threshold, the ranks are modeled as a linear relation. For a given gene, we rank its gene effect score in each cell line, then arrange the cell lines in order of increasing gene effect score for that gene. The ranks of the gene in these cell lines are fitted to a linear model, where a smaller slope indicates higher dependency across all the cell lines. The slope threshold is chosen as the point of minimum density between the two Gaussian distributions estimated from the distribution of slopes. All genes with a slope below this threshold are reported. Note that this removes the need for a constraint such as the 90th percentile of the least depleted cell lines.
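The procedure described above can be sketched in base R. This is an illustrative sketch under stated assumptions, not the package's implementation: the toy matrix dep, the helper slope_of, and the mixture function mix are hypothetical names, the data are simulated, each gene's ranks are simply sorted from best to worst before fitting, and the two Gaussian distributions are approximated by splitting the slopes with k-means and taking the mean and standard deviation of each group.

```r
# Toy dependency matrix (simulated data): 200 genes x 20 cell lines.
set.seed(1)
n_genes <- 200; n_lines <- 20; n_core <- 20
dep <- matrix(rnorm(n_genes * n_lines), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("line", seq_len(n_lines))))
core <- rownames(dep)[seq_len(n_core)]
# Simulated core genes get a strong dependency in every cell line.
dep[core, ] <- rnorm(n_core * n_lines, mean = 5, sd = 0.3)

# Rank genes within each cell line; rank 1 = strongest dependency.
ranks <- apply(dep, 2, function(x) rank(-x))

# Per gene: order its ranks from best to worst and fit rank ~ position.
# (Assumption: sorting the ranks stands in for ordering the cell lines.)
slope_of <- function(r) {
  y <- sort(r)
  x <- seq_along(y)
  unname(coef(lm(y ~ x))[2])
}
slopes <- apply(ranks, 1, slope_of)

# Assumed stand-in for the two-Gaussian step: split the slopes into two
# groups with k-means, then describe each group by its mean and sd.
km <- kmeans(slopes, centers = matrix(range(slopes), ncol = 1))
low <- which.min(km$centers[, 1])
g1 <- slopes[km$cluster == low]
g2 <- slopes[km$cluster != low]

# Density of the resulting two-component Gaussian mixture.
mix <- function(s) {
  length(g1) / length(slopes) * dnorm(s, mean(g1), sd(g1)) +
    length(g2) / length(slopes) * dnorm(s, mean(g2), sd(g2))
}

# Threshold = point of minimum mixture density between the two means.
threshold <- optimize(mix, interval = c(mean(g1), mean(g2)))$minimum
cf <- names(slopes)[slopes < threshold]
```

Here cf holds the genes whose slope falls below the threshold. Genes that rank near the top in every cell line have nearly flat fitted lines, while genes whose ranks spread across the whole range have steep ones, which is what makes the slope distribution separable into two groups.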
A list with the following elements:
cfgenes |
Vector of the genes identified as core fitness genes |
LeastDependent |
A data frame where each row corresponds to a gene. There are two columns: Value stores the slope of the linear model fitted to the ranks of the gene in all cell lines, and Gene stores the gene name. |
threshold |
The slope threshold for core fitness genes |
C. Pacini, E. Karakoc & F. Iorio
data(exampleSBFData)
results <- ADAM2.SlopeCF(depMat=exampleSBFData)
cfgenes <- results$cfgenes